Milton Keynes, Buckinghamshire, United Kingdom Hybrid / WFH Options
Jefferson Frank
guidelines. Requirements 3+ years experience in an SRE role Strong understanding of how important SLOs, SLIs and KPIs are to the systems you support, using observability to be your grounding point on a daily basis Extensive knowledge of all major services in GCP (Cloud Run, BigQuery, GKE etc) Experience in setting more »
City of London, London, United Kingdom Hybrid / WFH Options
SKY
Working closely with the Platform Enablement team and the SRE team to drive new ideas into the product roadmaps around the deployment self-service, observability, security, and reliability Team Management Define and track key performance indicators (KPIs) to measure the success and impact of Linux initiatives. Fostering a team culture more »
in the business Key Experience: Deep understanding of SRE ethos and principles Vast amounts of Terraform experience Solid experience with Python Solid experience of Observability tooling. Good experience in dashboard creation/data visualisation using tools such as Google Looker, or Grafana Strong CI/CD experience Strong containerisation experience more »
Middlesex, South East, United Kingdom Hybrid / WFH Options
SKY
Working closely with the Platform Enablement team and the SRE team to drive new ideas into the product roadmaps around the deployment self-service, observability, security, and reliability Team Management Define and track key performance indicators (KPIs) to measure the success and impact of Linux initiatives. Fostering a team culture more »
Reigate, Surrey, South East, United Kingdom Hybrid / WFH Options
Client Server
collaborate across product focussed Agile engineering teams to ensure the reliability, availability and performance of client facing services. Responsibilities will include managing and configuring observability platforms such as DataDog and PagerDuty to provide proactive monitoring of production (and other) environments, design and implementation of automation processes to drive efficiencies, leading … similar SRE/Site Reliability Engineer position You have experience of running 24x7 services in the public cloud - Azure preferred You have experience with observability tools such as DataDog and PagerDuty You have a good knowledge of Containerisation - Kubernetes, AKS You have strong scripting skills for automation, PowerShell or Python more »
serverless function runtime version upgrades) Infrastructure drift monitoring & management Software maintenance (e.g. language/framework/package version upgrades) Key rotation management Tooling maintenance (e.g. observability stack) Performance and Load testing Role Requirements Strong knowledge of Microsoft technologies. Active Directory, Entra, SharePoint, 365, Windows 10/11, Intune, Application packaging (Intune more »
Letchworth Garden City, Hertfordshire, United Kingdom Hybrid / WFH Options
Jefferson Frank
guidelines. Requirements 3+ years experience in an SRE role Strong understanding of how important SLOs, SLIs and KPIs are to the systems you support, using observability to be your grounding point on a daily basis Extensive knowledge of all major services in GCP (Cloud Run, BigQuery, GKE etc) Experience in setting more »
Cheltenham, England, United Kingdom Hybrid / WFH Options
Northrop Grumman
the support for live (mission critical) systems, working with customers to fault find and resolve issues within strict time constraints. Experience using Industry standard observability tooling (ELK, Grafana, Prometheus), creating/maintaining these environments is a plus. You will have a strong understanding & navigation of both Windows and Linux operating more »
City of London, London, United Kingdom Hybrid / WFH Options
83zero Limited
the system. Consulting/Coaching experience in implementing new ways of working and enabling agile delivery transformation. Enabling continuous delivery while ensuring reliability, quality, observability, and performance Understanding of build and deployment pipelines, test driven development, automated testing, Test data management, automated Environment provisioning, Version control, Monitoring and alerting and … DevOps enablers. Understanding of observability and monitoring platform; Experience of having collaborated with developers to implement and improve observability and monitoring practices is preferred. Experience in leveraging DORA framework to effectively improve the performance of DevOps teams - Desired. Experience in defining OKRs/KPIs, setting up process/systems to more »
Manchester, North West, United Kingdom Hybrid / WFH Options
83zero Limited
the system. Consulting/Coaching experience in implementing new ways of working and enabling agile delivery transformation. Enabling continuous delivery while ensuring reliability, quality, observability, and performance Understanding of build and deployment pipelines, test driven development, automated testing, Test data management, automated Environment provisioning, Version control, Monitoring and alerting and … DevOps enablers. Understanding of observability and monitoring platform; Experience of having collaborated with developers to implement and improve observability and monitoring practices is preferred. Experience in leveraging DORA framework to effectively improve the performance of DevOps teams - Desired. Experience in defining OKRs/KPIs, setting up process/systems to more »
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Confidential
skills, and the capacity to work well in a team environment are often considered essential. Understanding of DevOps principles and practices, including monitoring, logging, observability and infrastructure management. Why Join Clarus Software? Here are some of the reasons you will love working here: A team of amazing people. We are more »
understanding of Google Cloud (GCP) Deep understanding of SRE ethos and principles Vast amounts of Terraform experience Solid experience with Python Solid experience of Observability tooling. Good experience in dashboard creation/data visualisation using tools such as Google Looker, or Grafana Strong CI/CD experience Strong containerisation experience more »
Feltham, Greater London, United Kingdom Hybrid / WFH Options
Avanti Recruitment
automate the provisioning of infrastructure using Terraform. You will collaborate with the Software Development team to set standards and best practice, design and develop observability tools, monitor and optimise the performance of applications running on the platform. To be considered for interview you are likely to have 3+ years of … following: Azure Cloud platforms - Experience of deploying, monitoring, scaling and maintaining cloud infrastructure Containerisation with Kubernetes orchestration CI/CD pipelines Configuration management - Ansible Observability, Monitoring and alerting tools Being able to work independently and as part of a small team with great communication skills. APPLY TODAY FOR IMMEDIATE CONSIDERATION more »
Employment Type: Permanent
Salary: £75,000/annum + 12% pension + healthcare + 25 days
stack development web and mobile front-end technologies such as React, Java, APIs & microservices, PostgreSQL, data structures, workflow Knowledge of Site Reliability Engineering, automation, observability, incident management, resilience, disaster recovery, high availability, documentation IAM engineering experience, authentication, authorisation, single sign-on, multi-factor authentication, user lifecycle management, hands on CI more »
GCP) and containerization technologies (e.g., Docker, Kubernetes). Experience with configuration management tools (e.g., Ansible, Terraform) and CI/CD pipelines. Knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, ELK Stack). Scripting and automation skills (e.g., Python, Bash). Excellent problem-solving and troubleshooting skills. Strong communication and collaboration skills. #J more »
with a cloud provider (AWS/Azure/GCE), or sysadmin/SRE experience in data centers Experience designing, building, and operating high-scale observability or infrastructure systems Working knowledge of networking fundamentals, experience with CNIs or cloud networking infrastructure preferred What We Require 4+ years of professional software development more »
for all of Canonical's core services, networks, and infrastructure Develop skills in troubleshooting, capacity planning, and performance investigation, Setting up, maintaining and using observability tools such as Prometheus, Grafana, and Elasticsearch; design, implement and maintain monitoring and alerting for various systems and services Provide assistance and work with globally more »
of experience in industry. Strong working knowledge of Golang. Experience with Kubernetes and the ecosystem of Cloud Native tools. Experience with building infrastructure with observability as a first class concept. Bonus skills Contributions to open source projects Experience using machine learning tools in production. A broad understanding of data science more »
management, and the prowess of cloud-native solutions. In your pursuit of continuous improvement, you're not solely reliant on metrics; you dive into observability metrics and user feedback, steering our technical progress with insightful analysis. Staying ahead is not just a practice; it's inherent. You're not merely more »
with the organisation's objectives and technology roadmap. Design technical solutions that meet all client requirements across different metrics (performance, security, observability) Technology Selection: Evaluate and recommend integration technologies, middleware, and tools to support integration initiatives, taking into account scalability, performance, security, and cost-effectiveness. Collaboration: Working more »
models to production. Optimizing the platform runtime for maximum performance. This is largely C++ code with parts of the pipeline running on GPU. Building observability and telemetry. Requirements and experience we are looking for 3+ years of experience writing production software in C++ and Python of experience building applications processing more »
both to your teammates. You're interested in reliability engineering concepts such as the different types of testing, progressive deployments, error budgets, the role of observability, and fault-tolerant design. You're excited about being part of a team of diverse perspectives and backgrounds that believe in tackling challenges, growing hand more »
Stanmore, England, United Kingdom Hybrid / WFH Options
Sky
team, and provide mentoring to individual team members. • Devise measures for application performance, build systems to apply them to ensure a high level of observability across the estate, and consult with individual teams to help them achieve their performance goals. What you'll bring • Extensive and varied experience of software more »
on our Ruby on Rails monolith, building data models, APIs, and business logic services. Delivering your work using agile methodologies and tools like tests, observability, AB-tests, and feature flags. Analyzing data to identify problems and generate new ideas, using various sources such as our database, application logs, and user more »