team that provides operational support for Linux servers, networks, and AWS cloud infrastructure. Manage security vulnerabilities and implement mitigations. Implement and maintain monitoring and observability solutions. Provision infrastructure for new projects and products. Support project delivery and provide infrastructure design expertise. Maintain and improve configuration management (Puppet) and DevOps processes. More ❯
automate infrastructure provisioning and management. Establish and maintain robust security controls across all cloud environments, ensuring compliance with relevant standards and regulations. Utilise advanced observability tools to monitor and optimise the performance of production services, proactively identifying and resolving issues. Design and optimise CI/CD pipelines using platforms such More ❯
for consumption in reporting, analytics and science. Optimise data pipelines and queries for better performance and cost-efficiency. Integrate data pipelines with monitoring and observability to proactively detect and resolve issues before they impact business operations. Design and build data models for lake house storage and analytics. Implement and maintain More ❯
year) The Role Lead a small engineering team while staying deeply hands-on Design and optimise high-volume, distributed backend services Drive quality engineering, observability, and fault-tolerance Collaborate closely with PMs and senior engineers on roadmap delivery Shape the platform’s technical evolution through major architectural changes Champion modern More ❯
with Event-Driven Architecture using AWS services (SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams. More ❯
Accreditation Council for Graduate Medical Education
collaboration such as GitHub, ArgoCD, or similar. Experience utilizing CI/CD platforms to automate provisioning infrastructure, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience using Infrastructure as Code tools for provisioning infrastructure such as Terraform, CloudFormation More ❯
configuration management. Experience with cloud infrastructure and managing tools for cloud services (e.g., AWS Lambda, EC2, Kubernetes). Hands-on experience with monitoring and observability tools like Prometheus, Grafana, or ELK Stack for tracking tool performance. Knowledge of security best practices in managing tool configurations, particularly for CI/CD More ❯
Kubernetes (CKE) and Terraform. Familiar with databases (SQL or NoSQL). Experience with client/server software architectures & networking, or microservice architectures. Experience with observability tools like Grafana, Prometheus, OpenTelemetry and others. Experience with streaming architectures and tools (e.g. Kafka More ❯
governance standards across all data engineering processes. • Performance Optimisation: Monitor and optimise the performance of AI models and data pipelines, implementing best practices in observability and monitoring. • Cross-Functional Collaboration: Work closely with product managers, software engineers, and other stakeholders to understand requirements and deliver technical solutions that align with More ❯
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Sage City
workflows to standardise best practices across teams. Designing and implementing modern CI/CD pipelines using tools like GitHub Actions, CircleCI, or Buildkite. Enhancing observability as code - Define dashboards, alerts, and monitoring configurations (New Relic preferred). Building automation tools, APIs, and internal services to reduce toil and improve developer More ❯
MySQL, Postgres, Redis, etc.) • Experience with DevOps engineering and working with container orchestration, such as with Docker or Kubernetes • Experience with log monitoring and observability via platforms like Sumo Logic or CloudWatch • Experience automating infrastructure, testing, and deployments using tools like CircleCI • Configuration management tooling and infrastructure as code knowledge is More ❯
in a fast-paced environment. Bonus skills and experience AWS Solution Architect certification. Experience with Docker, Kubernetes, or other container orchestration tools. Familiarity with observability tools (e.g., New Relic) for tracking usage and service health. Experience with Kafka, Flink, or IoT streaming technologies. Background in financial services or other regulated More ❯
GCP & Azure) Solid understanding of high-quality coding, testing, and development practices Preferable skills: Infrastructure as code (e.g. Terraform/CloudFormation/Pulumi) Monitoring & Observability What you can expect from us We won't just meet your expectations. We'll defy them. So you'll enjoy the comprehensive rewards package More ❯
Terraform, Kubernetes, Docker, and AWS services. Experience in designing and implementing scalable internal developer platforms (IDPs). Strong knowledge of CI/CD pipelines, observability, and platform automation. If you're ready to redefine software engineering at scale, drive platform excellence, and work hands-on with cutting-edge technologies, we More ❯
such as Terraform, Kubernetes, Docker, and AWS. Experience designing and implementing scalable Internal Developer Platforms (IDPs). Strong knowledge of CI/CD pipelines, observability, and platform automation. If you're ready to shape the future of software engineering at scale, drive platform excellence, and work at the heart of More ❯
of trading processes, including FIX connectivity, order management, pricing, and market making. Familiarity with ITIL framework processes for incident and problem management. Knowledge of observability and monitoring tools (e.g., ELK, Grafana). Understanding of object-oriented programming languages such as C# .NET and Python is a plus. Why join us More ❯
Hart, Yorkshire, United Kingdom Hybrid / WFH Options
RVU Co UK
CI/CD). Make data guided decisions that impact core business metrics and processes. Solid understanding of platform and reliability engineering approaches, including observability, performance optimisation, capturing analytics and security best practices. Drive the adoption of new technologies like Go and Python. Facilitate collaboration between teams and build a More ❯
and exchange gateways. What will you do? Collaborate closely with our datacentre infrastructure and network engineers, in a role that also overlaps with the monitoring and observability domain. Work in a DevOps approach that involves concepts of code review, automated testing and release pipelines. Work in an environment that is protected by More ❯
within an Agile framework to support the development and operations of web applications. Desirable Skills: Serverless & Microservices: Experience with AWS Lambda, Azure Functions, and event-driven architectures. Observability & Monitoring: Familiarity with monitoring tools like Splunk, Datadog, or New Relic for enhanced visibility and observability. Networking: Knowledge of VPCs, VPNs, and load balancing in cloud environments. GDS Standards: Awareness of GDS More ❯
be able to build new DevOps pipelines: AWS S3, RDS, Route 53, IAM, EKS, Secrets Manager, ECR Terraform Deployment of AWS Resources, Pipelines, OCI Observability: ELK, Dynatrace, Prometheus Others: Vault, RedHat As an equal opportunities employer, we welcome applications from individuals of all backgrounds. However, for you to be eligible More ❯
be able to build new DevOps pipelines: AWS S3, RDS, Route 53, IAM, EKS, Secrets Manager, ECR Terraform Deployment of AWS Resources, Pipelines, OCI Observability: ELK, Dynatrace, Prometheus Others: Vault, RedHat As an equal opportunities employer, we welcome applications from individuals of all backgrounds. However, for you to be eligible More ❯
be able to build new DevOps pipelines: AWS S3, RDS, Route 53, IAM, EKS, Secrets Manager, ECR Terraform Deployment of AWS Resources, Pipelines, OCI Observability: ELK, Dynatrace, Prometheus Others: Vault, RedHat As an equal opportunities employer, we welcome applications from individuals of all backgrounds. However, for you to be eligible More ❯
Automate the provisioning, scaling, and monitoring of cloud resources within Azure, implementing advanced automation and orchestration strategies to increase operational efficiency. Implement and optimise observability, logging, and alerting with tools such as Prometheus, Grafana, and OpenTelemetry. Collaborate with the application development teams to enable and facilitate best practices for containerisation … Docker, and Kubernetes (AKS), with experience in deploying, scaling, and managing containerized workloads in production environments and their supporting infrastructure layers. Experience implementing monitoring, observability, traceability, and logging stacks, including tools such as Prometheus, New Relic, Grafana, Loki, and OpenTelemetry. Strong knowledge of Cloud Native Technologies and CNCF-recommended solutions More ❯
is an exciting opportunity to join our growing Operations team managing Kubernetes clusters in Production and, through a DevOps culture, empower development teams with observability insights they can use to innovate faster. We are looking for a Site Reliability Engineer, or production experienced DevOps Engineer, who has working experience building … observability for cloud native SaaS products and driving operational excellence. You will be responsible for delivering our monitoring infrastructure, shaping observability, and responding to incidents as well as ensuring the platform is performant and reliable. You will be a key member of the team, liaising with product teams, embedding SRE … principles and building the observability platform for the next stage of growth at GSS. You will have direct input into the direction of Technical Operations, solving problems, supporting developers and optimising the platform through code. Plus, enjoy a collaborative, flexible, and innovative work culture where your ideas are valued. What More ❯
in the UK You will ideally have: Experience in Building AWS Native Experience with Kubernetes, Infrastructure Experience with Docker Experience with Containerization Familiar with Observability stacks, i.e. ELK, LGTM. Proficient with IaC tools (Terraform), understanding general use-cases Proficient with at least one scripting language: Python, Ruby, JavaScript, etc. Desirable More ❯