Site Reliability Engineer


Ensure system stability, performance, and scalability as a Site Reliability Engineer. Design and operate monitoring systems, automate deployments, and lead incident response. Collaborate with cross-functional teams to deliver high-quality products.

Technical Skills Required
Cloud platforms (AWS, GCP, Azure)
Kubernetes
Observability tools (Prometheus, Grafana, ELK stack)
Scripting and automation (Python, Bash)
CI/CD pipelines
Infrastructure as code
Benefits & Perks
Competitive salaries
EGYM Wellpass
Language courses
Jobrad
Flexible working hours
Relocation and visa support

Job Description


About us

At arculus, we design, build, and maintain cutting-edge autonomous mobile robots and the software ecosystem around them. Our Development department brings together software, infrastructure, and product experts in a collaborative, international environment, focused on delivering reliable and high-quality products that make a real difference in intralogistics.


Your Role

As a Site Reliability Engineer, you will be responsible for ensuring the stability, performance, and scalability of our Automation Software platform. Your mission begins with a strong focus on the "Now": building robust monitoring, automation, and operational practices that keep our systems reliable under real-world conditions.

Operating at the intersection of software development and operations, you will proactively prevent incidents, optimize system behavior, and enable fast, reliable service delivery. By aligning reliability engineering with product and architectural goals, you will ensure our systems meet critical KPIs such as uptime, latency, and deployment velocity across the entire lifecycle.


Your Tasks & Responsibilities

  • Design and operate monitoring, alerting, and incident response systems to ensure high availability
  • Define and manage SLIs, SLOs, and SLAs; proactively mitigate reliability, performance, and capacity risks
  • Automate deployments, scaling, and operational workflows; implement infrastructure as code and self-healing patterns
  • Optimize CI/CD pipelines for faster, safer, and more reliable releases
  • Lead or support incident response, root cause analysis, and post-mortems; translate findings into preventive measures
  • Collaborate with architects, developers, and product teams to ensure scalable, reliable system design
  • Review system changes for operational, performance, and reliability impact
  • Support capacity planning, performance benchmarking, and scaling strategies
  • Contribute to security monitoring and ensure secure system operations
  • Drive continuous improvement in observability, reliability, and operational efficiency


Your Experience

  • 3+ years in Site Reliability Engineering, DevOps, or similar roles in production environments
  • Proven experience improving system reliability, reducing downtime, and enhancing deployment processes
  • Strong expertise in cloud platforms (AWS, GCP, Azure) and Kubernetes
  • Hands-on experience with observability tools (Prometheus, Grafana, ELK stack)
  • Solid scripting and automation skills (e.g., Python, Bash)
  • Experience operating and scaling distributed systems in large production environments
  • Familiarity with CI/CD pipelines, infrastructure as code, and modern DevOps practices


Who You Are

  • Passionate about building reliable, scalable, and observable systems
  • Strong communicator, able to collaborate effectively across engineering, product, and operations teams
  • Proactive and solution-oriented, with a strong sense of ownership and accountability
  • Analytical and structured thinker with a focus on continuous improvement
  • Comfortable working in fast-paced, complex environments with evolving system landscapes
  • Motivated to ensure technical excellence translates into stable and high-performing real-world systems


WHY ARCULUS

  • We are a diverse, global team of 100+ creative thinkers, algorithmic brains, makers, movers, and shakers.
  • Our approach comes from a continuous cycle: assemble, weld, code, test, deploy or delete, and repeat. That is how we deliver innovative solutions to tackle the biggest intralogistics challenges.
  • Our tech space is in the east of Munich. It serves as a hub for our team's creativity and collaboration, featuring state-of-the-art meeting rooms, a fully equipped electronics lab, and a spacious robotics testing area. Our team also enjoys a variety of social spaces, all within the modern infrastructure of the renowned Neue Balan campus.
  • We are more than just a workplace; we are a community. We encourage connection through a range of activities, including hiking trips, running events, ping pong tournaments, and quiz nights, so there is something for everyone.
  • We also believe that work should be rewarding in more ways than one. That is why we offer competitive salaries and benefits like EGYM Wellpass, language courses, Jobrad, and flexible working hours.
  • If you are moving to join our team, we provide relocation and visa support to help make the transition as smooth as possible.


ABOUT THE COMPANY

arculus is a part of Jungheinrich and independently develops high-end mobile robots and software products for intralogistics automation. From mechanics to electronics and code – our engineering powerhouse has it all. We combine the speed and creativity of an agile tech company with the strength of a leading global intralogistics player. Collaboration, innovation, and continuous learning: that is how we achieve an open-minded and fast-paced working culture.

COMMITTED TO DIVERSITY AND INCLUSION

We are an equal opportunity employer and highly value diversity and inclusivity, which we see as strengths. While we are making progress, we are not yet where we want to be. Still, we believe in the power of a diverse workforce and welcome applicants of all genders, ethnicities, ages, national origins, sexual orientations, cultures, and educational backgrounds. Our goal is to create a work culture where everyone feels equally heard and included.

