AWS DevOps / Site Reliability Engineer (SRE)
Join Atlantis Technology Solutions as an experienced AWS DevOps / Site Reliability Engineer (SRE) to design, operate, and continuously improve highly available cloud infrastructure. As a senior technical authority, you will serve as the final escalation point for complex incidents and drive automation and operational excellence across the platform.
Job Description
Company Description
Atlantis Technology Solutions is a UK-based technology consulting and services firm delivering high-impact digital solutions across web, mobile, cloud, and enterprise platforms. Headquartered in London, we partner with global clients to design, build, and operate scalable, secure, and high-performance systems. Our teams are driven by engineering excellence, automation-first thinking, and a strong focus on measurable business outcomes.
Role Description
We are seeking an experienced AWS DevOps / Site Reliability Engineer (SRE) to design, operate, and continuously improve highly available cloud infrastructure. In this role, you will act as a senior technical authority for reliability and scalability, serving as the final escalation point for complex incidents while driving automation and operational excellence across the platform.
You will work closely with development, platform, and operations teams to embed SRE best practices throughout the software lifecycle and ensure resilient, zero-downtime systems.
Key Responsibilities
- Serve as the final escalation point for complex production incidents and system failures
- Design and develop advanced automation for deployment, monitoring, scaling, and recovery
- Build, maintain, and optimize CI/CD pipelines enabling zero-downtime deployments
- Implement and manage Infrastructure as Code (IaC) using Terraform and Ansible
- Perform capacity planning, performance tuning, and reliability engineering
- Monitor systems using metrics and logs to proactively identify and resolve bottlenecks
- Collaborate with engineering teams to embed SRE principles into application design and delivery
- Ensure high availability, security, and operational excellence across AWS environments
Required Skills & Experience
- 5+ years of hands-on experience in DevOps and Site Reliability Engineering
- Strong expertise with AWS cloud services
- Deep experience with Kubernetes and containerized workloads
- Proficiency in Infrastructure as Code (Terraform, Ansible)
- Strong experience with CI/CD tools such as Jenkins, AWS CodeDeploy, GitLab CI, and ArgoCD
- Hands-on knowledge of monitoring and observability tools (Prometheus, Grafana)
- Solid understanding of databases (SQL/NoSQL) and caching strategies
- Strong knowledge of networking, security, and Linux/Windows systems
- Excellent problem-solving, communication, and stakeholder-management skills
- Ability to work effectively in cross-functional, distributed teams
Nice to Have
- AWS Certified DevOps Engineer or equivalent cloud certifications
- Experience running production systems at scale in high-availability environments
Why Join Atlantis Technology Solutions?
- Work remotely with globally distributed teams
- Own critical reliability and platform decisions
- Operate at scale using modern cloud-native tooling
- Be part of a company that values engineering rigor, automation, and continuous improvement
Note: No visa sponsorship is provided for this role.