applications into microservices architectures. In-depth Linux/Unix experience, emphasizing system performance tuning and automation. Familiarity with monitoring, logging, and observability tools (e.g., Prometheus, Grafana, Loki, OTel, ELK stack) to ensure system reliability and performance. Experience in developing and working with backend application technologies (e.g., Express, Django). Benefits
GitLab CI/CD, or CircleCI. Strong knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture. Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch). Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems. Experience of incident management and blameless
as Code (IaC) Terraform – for provisioning cloud resources and maintaining infrastructure as code. CloudFormation – if working in AWS environments. Observability & Monitoring Time-Series Databases – Prometheus, InfluxDB, OpenTSDB, or TimescaleDB for metrics collection and storage. Logging & Tracing – Experience with ELK (Elasticsearch, Logstash, Kibana) or Graylog, OpenTelemetry, and Grafana. APM (Application Performance
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
in at least one programming language that compiles to machine code such as Rust, C++, or Go. Expert knowledge of monitoring technologies such as Prometheus, Grafana, and PagerDuty. Expert knowledge of deployment technologies such as Pulumi or Terraform. Expert knowledge of Kubernetes. Responsibilities: Improving our observability by adding/adjusting
agile methodologies. Good understanding of electronic and systems design. Strong communication and collaboration skills. Desired Skills: Experience with monitoring and logging tools such as Prometheus, Grafana, ELK stack, or Splunk. Designing and implementing graphical user interfaces. Appreciation for physics, particularly radiation-related topics. Education and Experience: Bachelor's degree in
tasks. Experience with CI/CD tools (GitHub Actions, Jenkins, AWS CodePipeline), and integrating data-centric workflows. Familiarity with monitoring and logging tools (e.g., Prometheus, Loki, Grafana) in application and data-intensive environments. Proficiency in Configuration Management tools (Chef, Puppet, Ansible) and data orchestration tools (e.g., Airflow, Prefect). Strong
AWS cloud services to implement highly efficient architecture. Ability to analyze infrastructure and implement security best practices. Experience with infrastructure monitoring tools like Nagios, Prometheus, and Grafana. Expertise in containerization platforms like Docker and container orchestration platforms like Kubernetes and Rancher. Familiarity with infrastructure as code tools such as Terraform, CloudFormation
London, South East England, United Kingdom Hybrid / WFH Options
LHH
Bash, or PowerShell for automation. Understanding of AWS networking concepts, including VPCs, subnets, and security groups. Experience with monitoring and logging solutions such as Prometheus, Grafana, ELK Stack, or AWS CloudWatch. Familiarity with Zero Trust security models and best practices for securing cloud workloads. Ability to troubleshoot complex infrastructure issues
Python, Bash, or PowerShell for automation. Understanding of AWS networking concepts, including VPCs, subnets, and security groups. Experience with monitoring and logging solutions, such as Prometheus, Grafana, ELK Stack, or AWS CloudWatch. Familiarity with Zero Trust security models and best practices for securing cloud workloads. Ability to troubleshoot complex infrastructure issues
native environments at scale. Exposure to high-load, high-performance systems and large-scale microservices architectures. Experience with observability and monitoring frameworks (OpenTelemetry, Grafana, Prometheus). Knowledge of Graph Databases and AI integration in platform operations is a plus. Experience mentoring junior engineers and leading cross-functional initiatives. Why Join
or GCP. Familiarity with DevOps tools (Terraform, Ansible, Jenkins, Git) and modern CI/CD workflows. Solid foundation in observability and monitoring tools like Prometheus, Grafana, ELK, or Datadog. Experience working with containerized environments (Docker, Kubernetes). Clear communicator and strong collaborator across time zones and teams. You might also
the ability to architect secure, performant, and highly available cloud solutions. Proficiency with monitoring and log analytics tools such as AWS CloudWatch, ELK Stack, Prometheus, Datadog, or New Relic, to maintain observability and ensure operational excellence. Demonstrated leadership skills in managing complex, high-pressure situations and guiding teams through incident
own end-to-end delivery of solutions. Expert knowledge of SRE fundamentals and a commitment to best practice. Fluency with common observability tooling like Prometheus, Grafana, OTel, and CloudWatch. Experience analysing and building data telemetry, querying (PromQL), modelling, pipelines, and dashboards to provide concise, focused insights and alerts for distributed
Proficiency in Jenkins, GitHub Actions, GitLab CI, or Azure DevOps. Containerization & Orchestration - Experience with Docker and Kubernetes. Monitoring & Logging - Knowledge of tools such as Prometheus, Grafana, ELK Stack, or CloudWatch. Security & Compliance - Understanding of public sector security frameworks, including ISO 27001, NCSC guidelines, and CIS benchmarks. Scripting & Automation - Proficiency in
networking, such as AWS VPC, Security Groups, Load Balancer, ASG, as well as cloud security, such as IAM and WebACL. Familiarity with ELK, Grafana, Prometheus, or other log-aggregation/monitoring solutions. Experience with Continuous Integration tools such as Team Foundation Server, Azure DevOps, GitLab or Jenkins. Solid Windows, IIS
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Searchability
CI/CD) and automation tools like Terraform and Ansible. Programming: Proficiency in Python, Go, or Ruby. Monitoring and Observability: Hands-on experience with Prometheus, Grafana, ELK Stack, or similar technologies. Core Attributes: A passion for solving complex technical challenges in high-availability production environments. Strong communication and collaboration skills
Greater Bristol Area, United Kingdom Hybrid / WFH Options
Searchability NS&D
GitLab CI/CD) and automation tools like Terraform and Ansible. Programming: Proficiency in Python, Go, or Ruby. Monitoring & Observability: Hands-on experience with Prometheus, Grafana, ELK Stack, or similar technologies. Core Attributes: A passion for solving complex technical challenges in high-availability production environments. Strong communication and collaboration skills
experience with Docker (Swarm or Kubernetes) for container orchestration and management. Monitoring and Alerting: Familiarity with monitoring and analytics tools like Grafana, ELK, and Prometheus for system visibility and performance insights. Version Control and Collaboration: Knowledge of Git/GitHub/GitLab for code management, along with a collaborative approach
etc. (AWS VPC a plus). Demonstrable experience with infrastructure-as-code tools such as Terraform. Experience with monitoring and alerting tools such as Prometheus, Grafana, Sentry, Sumo Logic, or AWS cloud-native tools. Good interpersonal skills to collaborate with multi-functional teams. Demonstrable experience writing systems automation tooling is a
as code, like Terraform. Experience with APM tools like AppDynamics and Instana, and the ability to manage and maintain system configurations. Experience with monitoring tools like Prometheus, Kibana, and Grafana. Exposure to various infrastructure technologies such as Couchbase, Elasticsearch, Oracle, MSSQL, PostgreSQL, and Kafka. Exposure to Hyper-V, OpenStack, VMware, RHVE. Experience with