set targets will be expected. VARIED DAY TO DAY RESPONSIBILITIES Ensuring system reliability, performance, and scalability through monitoring and automation Building and maintaining observability solutions using Grafana, Prometheus, Loki, OpenTelemetry Proactively identifying and resolving performance bottlenecks and infrastructure issues Automating infrastructure provisioning, configuration management, and deployments Implementing effective logging, monitoring, and alerting strategies Managing incident response and post-mortem processes … robust observability, monitoring and logging solutions Strong proficiency with observability and monitoring tools such as Grafana, Prometheus, and Loki Strong experience with distributed tracing and telemetry tools such as OpenTelemetry An understanding of cloud networking architecture and load balancing techniques Experience with container orchestration platforms like Kubernetes Proficiency in infrastructure as code (IaC) tools such as Terraform or Ansible Strong More ❯
Logic, New Relic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger Hands-on experience with Infrastructure as Code (Terraform/Ansible) Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash) Hands-on experience with complex troubleshooting of Kubernetes and Docker containers Good knowledge of Regex, Lucene, PromQL Good knowledge of Linux Experience More ❯
Datadog, Sumologic, NewRelic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger Hands-on experience with Infrastructure as Code (Terraform/Ansible) Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash) Hands-on experience with complex troubleshooting of Kubernetes and Docker containers Good knowledge of RegEx, Lucene, PromQL Good knowledge of Linux Experience More ❯
Proven experience in building and scaling observability platforms in a cloud-native environment. Observability Expertise: Deep understanding of observability pillars (metrics, logs, traces) and related tools (e.g., Prometheus, Grafana, OpenTelemetry, Jaeger, Kibana Elastic Stack). AI/ML Proficiency: Hands-on experience integrating ML/AI models into observability systems to drive advanced insights, anomaly detection, and predictive analysis. Distributed More ❯
Sumologic, NewRelic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger Hands-on experience with Infrastructure as Code (Terraform/Ansible) Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash) Hands-on experience with complex troubleshooting of Kubernetes and Docker containers Good knowledge of RegEx, Lucene, PromQL Good knowledge of Linux Experience More ❯
Bring: Strong hands-on experience with cloud platforms (AWS, GCP, Azure) and DevOps tooling Familiarity with observability stacks like Grafana, Prometheus, Datadog, Splunk, Kibana, etc. Experience with technical integrations (OpenTelemetry, Fluentd, Fluentbit, Filebeat, etc.) Skilled in troubleshooting Kubernetes and containerised environments Strong communication skills — able to engage with technical teams and senior stakeholders Comfortable working in fast-paced environments and More ❯
London, England, United Kingdom Hybrid / WFH Options
9fin
you: Good working knowledge of AWS services including ECS, EC2, Lambda, VPC, IAM, Route53, CloudFront, S3, RDS Good understanding of monitoring and logging solutions, e.g. Prometheus, AWS CloudWatch, Grafana, OpenTelemetry, Honeycomb, ELK etc. Basic SRE knowledge, and experience in alerting and incident management platforms (e.g. Opsgenie, PagerDuty) Proven ability to provide and support strong and scalable CI/CD pipelines More ❯
Linux/Unix systems, SQL, and scripting Strong experience with a programming language such as Python, Java, etc. Strong experience with monitoring and observability tools (Prometheus, Grafana, Splunk, Geneos, OpenTelemetry, Corvil) Familiarity with cloud platforms, containerization (e.g., Kubernetes, Docker), and CI (Continuous Integration)/CD (Continuous Delivery) pipelines Strong understanding of the trade lifecycle and fundamental trading systems and knowledge More ❯
London, England, United Kingdom Hybrid / WFH Options
0840 Deutsche Bank Aktiengesellschaft, Filiale London
services Strong Linux/Unix, SQL, and scripting skills Experience with programming languages such as Python or Java Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Splunk, Geneos, OpenTelemetry, Corvil) Familiarity with cloud platforms, containerization (Kubernetes, Docker), and CI/CD pipelines Knowledge of trade lifecycle, trading systems, FX products, market structure, or algorithmic trading is advantageous How we More ❯
Experience orchestrating GPU/AI workloads, MLOps, or large‐language‐model serving. Knowledge of edge/IoT deployments and over‐the‐air update strategies. Exposure to observability stacks (OpenTelemetry, Loki) and security tooling (Falco, Aqua, Wiz). What We Offer Base salary £100,000 – £170,000 plus meaningful equity. Gym membership Comprehensive health, dental & vision coverage (UK & global travel More ❯
systems administration combined with strong SQL skills and proficiency in scripting languages such as Python or Java. Demonstrated experience with monitoring and observability tools including Prometheus, Grafana, Splunk, Geneos, OpenTelemetry or Corvil is highly desirable. Familiarity with cloud platforms as well as containerisation technologies like Kubernetes or Docker alongside CI/CD pipeline management is important for this role. Comprehensive More ❯
London, England, United Kingdom Hybrid / WFH Options
Deutsche Bank
Linux/Unix systems, SQL, and scripting Strong experience with a programming language such as Python, Java, etc. Strong experience with monitoring and observability tools (Prometheus, Grafana, Splunk, Geneos, OpenTelemetry, Corvil) Familiarity with cloud platforms, containerization (e.g., Kubernetes, Docker), and CI (Continuous Integration)/CD (Continuous Delivery) pipelines Strong understanding of the trade lifecycle and fundamental trading systems and knowledge More ❯
CD. Requirements: Expert-level scripting/coding skills in one or more languages (Python/Golang etc.). Expert knowledge of observability systems (Prometheus/ELK/Jaeger/OpenTelemetry/Service Meshes etc.). Experience with configuration management tools (Ansible/Puppet/Kapitan/Terraform). Experience with distributed data platforms (Kafka/Flink/Airflow). Comfortable More ❯
London, England, United Kingdom Hybrid / WFH Options
Tripadvisor
CI/CD – Jenkins, Git, Bitbucket, GitLab, Liquibase Experience in using SQL/NoSQL data stores – RDS, DynamoDB, ElastiCache, Solr Jira and Agile methodologies Desired Skills & Knowledge Experience with OpenTelemetry Experience of managing Kubernetes clusters and containerisation AWS and IaC – Terraform, CloudFormation, VPC, IAM, EC2, EKS, Lambda, RDS, S3, CloudWatch, Puppet, Docker Experience building and running monitoring infrastructure at a large More ❯
Job Description Be an integral part of a team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products. As a Software Engineer III at JPMorgan Chase within the Global Banking Platform (GBP), you are More ❯
London, England, United Kingdom Hybrid / WFH Options
Stealth AI Startup
This range is provided by Stealth AI Startup. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more. Base pay range Direct message the job poster from Stealth AI Startup Fractional Talent More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Stealth AI Startup
Who we are We are a seed-stage AI start-up backed by leading European and US funds. Our founders previously built and deployed cutting-edge AI systems at world-class research labs and high-growth technology companies. We apply More ❯