excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
cloud systems while keeping levels of manual work low. SREs are expected to be experienced in software engineering principles, operational discipline, and automation. The SRE team work on a fully remote basis and work in conjunction with their US and Australian teams as well. This company are a market leader … performance. Collaborate with product engineering teams to design/build fit-for-purpose and observable software. Required Skills and Experience: Proven experience in an SRE/DevOps/Platform Engineering role and having previously worked in a Software Engineering role in .NET and C#. Proficiency in C# development language - alongside … development opportunities. Working with a team of caring, high-performing, and passionate people who have fun supporting our vision, innovation, and continuous improvement. This SRE/DevOps Engineer role is with a market-leading global software company and is part of a large program of change …
Principal Engineer (Site Reliability/SRE), London Client: Genomics England Location: London, United Kingdom Job Category: - EU work permit required: Yes Job Reference: 655a6b409afd Job Views: 3 Posted: 18.04.2025 Expiry Date: 02.06.2025 Job Description: Are you passionate about … reliable services? At Genomics England we are looking for a Principal Site Reliability Engineer to help lead and refocus our small SRE capability, ultimately growing it to become thought-leaders in system reliability across the organisation. About the Role As the Principal Site Reliability … and you may have worked in a variety of organisational contexts. Whatever your past experience, it will have given you a deep understanding of SRE principles and practices and how these are used to build and operate reliable services that exceed customer expectations. You will be a problem-solver who …
About this Opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What You'll Need Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP. Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
purpose. About This Opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you’ll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What You’ll Need Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP. Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
reliable services? At Genomics England we are looking for a Principal Site Reliability Engineer to help lead and refocus our small SRE capability, ultimately growing it to become thought-leaders in system reliability across the organisation. About the Role As the Principal Site Reliability … and you may have worked in a variety of organisational contexts. Whatever your past experience, it will have given you a deep understanding of SRE principles and practices and how these are used to build and operate reliable services that exceed customer expectations. You will be a problem-solver who … comfortable not just leading your own team, but also engaging across the engineering community and with non-technical stakeholders. About the Tech Stack The SRE team will support squads that run a variety of services: most of these are either user-facing web applications (React), backend APIs (Python), bioinformatics pipelines …
truly innovative and impactful. Learn more about Stacklok’s mission, virtues, and leadership, HERE. Location This is a hybrid role that requires on-site work at our London office three (3) days a week. Our office is conveniently located in WeWork at 1 Mark Square, London, EC2A 4EG. … manage and maintain a robust security posture across the entire software supply chain. We are seeking a Senior Site Reliability Engineer (SRE) to support Stacklok Insight, our package intelligence service that empowers developers to make safer open source dependency choices. Embedded within the Stacklok Insight product team … exceptional service performance and reliability. In addition, this role will be part of a company-wide guild dedicated to unifying platform automation, observability, and reliability practices across all product lines, building a cohesive, high-performance SaaS platform with seamless observability and reliability throughout the Stacklok ecosystem. If site …
businesses to master market measurement, understand consumer behavior, and drive innovation. Job Description Key Responsibilities: Provide senior-level leadership and technical guidance to the Site Reliability Engineering team. Develop and execute a comprehensive technical strategy for ensuring high availability, scalability, and performance across GfK’s platforms, with a … and CI/CD pipelines (GitLab). Be the go-to person for resolving complex technical challenges and providing innovative solutions to ensure system reliability and performance. Team Leadership: Oversee and mentor team members, fostering a culture of continuous learning and improvement. Guide the implementation of best practices in … site reliability engineering, ensuring that the team stays at the forefront of technological advancements. Collaboration: Work closely with software engineering teams to design robust architectures, ensuring that new features and updates are delivered reliably and efficiently. Collaborate with external partners and vendors to optimize cloud services and infrastructure …
your best work. Learn more at iongroup.com. We are looking for experienced people who are competent in the cloud and knowledgeable about the SRE (site reliability engineering) domain. The team: The Core Architecture Team (CAT) produces and manages the core technology, methodologies, and frameworks that underpin all … discussing solutions, problems, and improvements within your team and others in the engineering organization. You have a passion for site reliability engineering (SRE) principles and adoption, and you are keen to start conversations with teams about reliability, performance, and security of the applications, services, and systems. You … are an advocate of the DevOps or SRE approach, promoting loosely coupled, heavily automated, constantly monitored distributed systems, and you always plan for failure and never take anything for granted. You are keen to raise the bar of the solutions provided by the whole engineering team (dev and ops). …
Site Reliability Engineer (Observability) London - Hybrid/3 Days Contract Inside IR35 - 6 Months initially We’re looking for a Site Reliability Engineer (SRE) to join our client to build and maintain observability systems and to ensure their core services remain reliable, scalable, and high-performing. Responsibilities: Deploy and … incident response. Build Grafana dashboards for system insights. Apply Infrastructure as Code (IaC) principles. Develop tooling in Golang (preferred) or Python. Advocate for SRE principles like SLOs, SLIs, and error budgets. Integrate monitoring with incident management workflows. Requirements: SRE principles and reliability engineering expertise. Solid familiarity with Linux …
London - United Kingdom **Senior Site Reliability Engineer | UK** **Overview** **Job description** IT technology lies at the very core of everything we do and our Engineering and Product … so we’ll go the extra mile to help you when we can. We are seeking an experienced Site Reliability Engineer (SRE) to join our Infrastructure team. As an SRE, your primary responsibility will be to ensure the reliability, availability, and performance of our technology platforms. … of best security practices, participating in vulnerability assessments, and threat mitigation. Requirements: - Deep understanding and experience in Site Reliability Engineering and in implementing SRE Practices - Excellent knowledge of AWS services and hands-on experience in production environments - Proficiency with networking protocols, DNS principles, and container orchestration technologies (Kubernetes, Helm …
The Site Reliability Engineer (SRE) will play a key role in maintaining and scaling infrastructure, ensuring reliability, performance, and scalability. You will collaborate closely with development, operations, and security teams to improve the reliability and efficiency of applications, addressing incidents, automating processes, and managing infrastructure … GCP, or Azure), automate provisioning using Terraform or CloudFormation, and manage resources for optimal performance. Monitor, troubleshoot, and resolve incidents, optimizing systems to ensure reliability and minimize downtime. Implement monitoring (Prometheus, Grafana, Datadog) and set up alerting systems to proactively address issues and ensure scalability. Work with DevOps, engineering … AWS Certified DevOps Engineer, Google Professional Cloud Architect, or similar. Containerization & Orchestration: Experience with Docker, Kubernetes, or ECS/EKS for containerized applications. SRE Experience: Familiarity with SRE principles like SLAs, SLOs, and error budgets, and practical application of those in large-scale systems. Distributed Systems: Understanding of microservices …
up, we want like-minded humans to join us on this exciting journey. Are you ready? As a Site Reliability Engineer (SRE), you will play an important role in designing, building, and maintaining the infrastructure and tools necessary to support our software applications and services. You will … collaborate closely with the product engineering squads, technical operations, and security teams to ensure the reliability, scalability, and security of our platform. Your responsibilities will include automating infrastructure provisioning, configuration management, and deployment pipelines, utilizing best practices and modern technologies to streamline processes and improve efficiency. You will also … be responsible for monitoring system performance, identifying bottlenecks, and implementing solutions to enhance system reliability and performance. Key Responsibilities Cloud Platform Management: Using Azure/AWS to manage and optimize infrastructure components, ensuring scalability, reliability, and cost management. Infrastructure Design and Implementation: Designing, building and maintaining the cloud …
improvements to application performance and stability. Collaborate on the design and implementation of the desired pipelines and processes for deployment to the production environment. The SRE will work closely with Platform and Software domains to ensure continuous improvement of performance and stability whilst adhering to standards. Undertake ad-hoc projects and … other activities as required. Key Accountabilities and Activities Contribute to the SRE function including: Drive evolution of the DevOps/GitOps toolchain, promoting improvements to streamline the software delivery process and showing improvements through metrics. Accountable for halting or stopping a project/product if the solution is not technically … playbooks. Integration with Domains including: Collaborating with Domains to plan, design, test and maintain the application. Design patterns for any component or structure under SRE responsibility. Implementation of components such as Monitoring and Logging. Manage the runbook preparations of Domains. Liaise with and support other teams on work items including: Developing …
Role: Site Reliability Engineer Location: London (Hybrid) Salary: £80,000 - £105,000 As our Site Reliability Engineer, you'll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve systems and environments. You'll define error … Candidate: Very strong engineering skills in Java, JavaScript or Python. OpenTelemetry experience. Must have Core Java/Python. Must have experience as an SRE. Knowledge of Python data structures. Strong knowledge of deploy and release services, automation and troubleshooting. Experience of utilising tools and technology across the software development …
purpose. About this opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What you'll need Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP. Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
Site Reliability Engineer - GoLang Specialist 5 days ago About: Step forward into the future of technology with ZILO. We're here to redefine what's possible in technology. While we're trusted by the global Transfer Agency sector, our technology is … flexible and designed to transform any business at scale. We've created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can't match. At ZILO, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore … creates real impact. If you're ready to shape the future, let's talk. Job Description: As a Site Reliability Engineer (SRE) Developer at ZILO Technologies, you will play a crucial role in maintaining and enhancing the reliability, performance, and scalability of our platform. You will …
Site Reliability Engineer - .NET Specialist About: Step forward into the future of technology with ZILO. We're here to redefine what's possible in technology. While we're trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business at … scale. We've created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can't match. At ZILO, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set … creates real impact. If you're ready to shape the future, let's talk. Job Description: As a Site Reliability Engineer (SRE) Developer at ZILO Technologies, you will play a crucial role in maintaining and enhancing the reliability, performance, and scalability of our platform. You will …
Join us as a Site Reliability Engineer. In this key role, you’ll improve, drive, and embed non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services. You’ll enjoy significant stakeholder interaction … chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development. What you'll do As our Site Reliability Engineer, you’ll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve … system and environment reliability. You’ll define SLOs, SLIs and error budgets that support finding the right balance between risk reliability and continuous improvement. You’ll also provide structure and help to our release process, suggesting and making improvements where possible. You’ll scale systems sustainably through mechanisms like …
Security cleared Site Reliability Engineer (SRE) Global Consultancy Up to £55k + benefits Hybrid remote, based in Manchester, London, or Glasgow. We are seeking a talented Security cleared Site Reliability Engineer to join our dynamic team as we embark on an exciting journey of … please only apply if you're comfortable with this responsibility. What We’re Looking For: Holding live security clearance (SC). Proven experience in implementing SRE principles and elevating system reliability. Deep expertise in microservices architecture and container orchestration. Hands-on experience building and enabling continuous delivery pipelines, from build to … functional teams, from developers to stakeholders. Familiarity with industry-leading tools like Dynatrace, Prometheus, and OpenTelemetry is a huge plus! Key Responsibilities: Implement SRE principles, with a focus on tools such as Dynatrace, Prometheus, and OpenTelemetry. Develop real-time dashboards and configure alerts to provide in-depth visibility …
Job Description: Site Reliability Engineer. Hybrid Working - London - 1 day a week on site. Financial Services. Lorien's leading banking client is looking for a Site Reliability Engineer to join the existing team on a brand-new project. This role is based in London … and will be via Umbrella. Main skills needed: Experienced working as a Senior Reliability Engineer. Azure Monitoring Sentinel and Telemetry experience. Experienced with Application monitoring - Splunk. Experienced with Incident Management - ServiceNow. Connection to MongoDB. Experience of moving/working with AWS. Support for tools. Carbon60, Lorien & SRG …
purpose. About this opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you’ll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What you’ll need Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP. Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
Site Reliability Engineer Remote type: Remote Locations: London, UK; Manchester, UK; Remote (United Kingdom); Belfast, UK Time type: Full time Posted 27 Days Ago Job requisition id: R The Intapp Cloud Platform is a rapidly … You will work with Development and Product Management to design and deliver new functionality. You will perform deep dives into both systemic and latent reliability issues; partner with software engineers across the organization to produce and roll out fixes. You will drive standardization efforts across multiple disciplines and services … a solid understanding of continuous integration, deployment and operations concepts. You have production experience of managing Windows Infrastructure running IIS workloads. Passion for resolving reliability issues and identifying strategies to mitigate them going forward. Automation mindset - if you can automate it, do it. Fluency in English. What you'll gain …
Site Reliability Engineer (SRE) - ASE/Provisioning, London Client: Apple Location: London, United Kingdom Job Category: - EU work permit required: Yes Job Reference: 2ad652bcaa4d Job Views: 3 Posted: 18.04.2025 Expiry Date: 02.06.2025 Job Description: Summary: People at Apple … Imagine what you could do here! Join Apple, and help us leave the world better than we found it. The Apple Service Engineering - Provisioning SRE team is looking for Site Reliability Engineers to build and run the services that hundreds of millions of customers use every day. This …