Cloud Infrastructure Engineer - Life-Safety Alarm Platform

Remote
AI Summary

Highly skilled engineer responsible for maintaining and improving the availability, scalability, and reliability of a life-safety alarm platform that serves tens of thousands of elderly users across Europe. Collaborate closely with the founder to keep the system running, because real emergencies go unanswered when it is down. Work independently with a strong sense of ownership over the entire infrastructure, applying hands-on production experience with Azure and working proficiency in C#.

Key Responsibilities
Monitor, maintain, and improve the availability and reliability of production systems running on Azure
Own incident response: triage, diagnose, resolve, and document production issues as third-level support
Build and refine observability: alerting, dashboards, log analysis, health checks, so problems surface before users notice
Manage Azure infrastructure across redundant regions
Read, modify, and extend existing C# / .NET application code to support operational needs
Technical Skills Required
Azure App Service, Azure SQL, Virtual Machines (Linux and Windows), C#, .NET, Application Insights, Log Analytics
Benefits & Perks
€4,200/month
Fully remote work
Full-time employment
Autonomy in how you solve problems and manage your workload
Sustainable, predictable working environment, no burnout culture
Work that has real-world impact on people's safety
Nice to Have
Azure Service Bus
CI/CD pipelines
Infrastructure as Code (Bicep/Terraform)

Job Description


€4,200/month · Fully Remote within the EU · Full-Time

You'll keep a life-safety emergency alarm platform running - reliably, securely, around the clock. The system manages alarm devices for tens of thousands of elderly users across Europe. When this platform goes down, real emergencies go unanswered. That's the weight of this role, and it's also what makes it meaningful.

This is a two-person operation: the founder and you. No layers of management, no separate ops team, no one else to escalate to. You'll own the infrastructure alongside me. Most days are calm and planned: improving monitoring, hardening systems, optimizing performance, extending the codebase where needed. Then there are the incidents, rare but real, where you need to be fully present, diagnosing quickly and resolving under pressure. It's not constant firefighting, but when it matters, it really matters.

What This Role Is
  • Infrastructure ownership and third-level support for a life-safety critical production system, not a sandbox
  • A mix of operations and C# development, with reliability as the priority
  • Predictable hours with occasional, not constant, urgency
  • Stable, long-term client relationship, no churn, no chaos
  • Fully remote from anywhere in the EU
What This Role Is Not
  • A junior role or learning opportunity
  • A pure development role with no support responsibilities
  • A "work whenever you want" arrangement with no accountability
  • A high-pressure, always-on-fire environment
What You'll Do
  • Monitor, maintain, and improve the availability and reliability of production systems running on Azure
  • Own incident response: triage, diagnose, resolve, and document production issues as third-level support
  • Manage Azure infrastructure across redundant regions: App Service, Azure SQL, Virtual Machines (Linux and Windows)
  • Build and refine observability: alerting, dashboards, log analysis, health checks, so problems surface before users notice
  • Improve security, scalability, and resilience across the platform
  • Read, modify, and extend existing C# / .NET application code to support operational needs, fix bugs, and implement improvements
  • Collaborate on architecture decisions and technical planning
  • Document processes, configurations, runbooks, and incident post-mortems

There's always work in the backlog: improving alerting coverage, reducing manual steps, strengthening redundancy. This isn't a role where you wait for tickets. We need someone who sees what needs improving and moves on it.

What You Need
  • Hands-on Azure production experience, specifically App Service, Azure SQL, and Virtual Machines (Linux and Windows). Not just certifications, real production environments.
  • Working proficiency in C# / .NET. You can read, debug, modify, and extend an existing codebase. You're not expected to architect from scratch, but you need to be comfortable working in the code.
  • Experience with monitoring and observability. Application Insights, Log Analytics, or equivalent. You've configured alerts, built dashboards, and used logs to diagnose production issues.
  • Production incident experience. You understand blast radius, rollback, and the discipline required when something breaks on a system people depend on.
  • Ability to work independently and make sound decisions under pressure
  • Solid English communication skills (written and verbal)
  • Git and Azure DevOps experience
  • Available during core hours: 09:00-17:00 CET

Useful but not required: Azure Service Bus, CI/CD pipelines, Infrastructure as Code (Bicep/Terraform).

Practical Details
  • Employment: Full-time contract. Employment structure depends on your location; we'll work out the details together.
  • Team: This is a two-person operation: the founder and you. There's no career ladder to climb because you'll shape this role as we grow. What there is: full technical ownership, zero politics, and direct influence over every decision.
  • Clients: Our primary client operates the life-safety alarm platform. We also support a small number of other clients with different systems and different challenges, which adds variety.
  • Hours: Core hours are 09:00-17:00 CET.
  • Incidents and urgency: We're purely technical third-level support; a 24/7 alarm central handles all user calls. Weekend incidents come up roughly once a quarter and actual outages maybe once a year, and weekend work happens only for genuine emergencies that can't wait until Monday. There is no formal on-call rotation and no pager duty, and nights are off. Over time, we'd like to share occasional weekend availability so we can cover each other's holidays, but you won't be on your own.
What We Offer
  • €4,200/month, full-time employment
  • Fully remote from anywhere in the EU
  • Autonomy in how you solve problems and manage your workload
  • A sustainable, predictable working environment, no burnout culture
  • Work that has real-world impact on people's safety
How to Apply

Send your CV to raphael@otten-it.com with the subject line "SRE / Cloud Engineer - EU Remote"

In your application, please answer this question briefly:

Tell us about a time a production system you were responsible for broke. What happened, what did you do, and what changed afterward?

No cover letter needed beyond that. If you want to tell us briefly what draws you to this role, we'd like to read it, but a CV and the answer above are perfectly fine.

