Platform Engineer - Observability

HelloFresh • Germany
Relocation
AI Summary

Build infrastructure automation at scale for 1000+ engineers, leveraging best-of-breed open-source and managed tools to deliver high-quality self-service observability solutions.

Key Highlights
Architect and build infrastructure automation at scale
Drive positive change in change failure rate, mean time to detect, and mean time to restore metrics
Consult and cooperate with engineers in developing and implementing best practices in observability and incident management
Technical Skills Required
Prometheus, OpenTelemetry, Grafana, Datadog, Terraform, Kubernetes, Go, Python
Benefits & Perks
Competitive compensation package
HelloFresh-subsidized Pension Scheme
Berlin relocation support
Hybrid working model
Exclusive discounts on HelloFresh box and office meals
German language learning budget
Access to HelloFresh Academy
Mental health support
Transportation perks
Working-parent-friendly benefits

Job Description


The role

The Foundations Alliance builds the tools, services, systems, and infrastructure that engineering teams across HelloFresh use daily.

As a Platform Engineer in the Observability team, you will play a key role in upholding high reliability and operational standards of business-critical components across HelloFresh. You will build robust infrastructure and self-service tools to empower HelloFresh engineers to uphold the reliability and performance of their services.

We'd love to hear from you if you’re passionate about reliability, observability, and automation!

What You’ll Do

  • Architect and build infrastructure automation at scale for 1000+ engineers
  • Leverage best-of-breed open-source and managed tools to deliver high-quality self-service observability solutions
  • Drive positive change in change failure rate, mean time to detect, and mean time to restore metrics
  • Apply AI-driven, LLM, and agentic capabilities to proactively surface issues, reduce mean time to detect (MTTD), and enable faster, data-informed incident resolution
  • Consult and cooperate with engineers in developing and implementing best practices in the field of observability and incident management

What You’ll Bring

  • Experience with observability tools and infrastructure (e.g. Prometheus, OpenTelemetry, Grafana, Datadog)
  • Experience working with SRE best practices and principles (e.g. SLIs/SLOs, incident management)
  • Familiarity with at least one cloud platform (e.g. AWS, Azure, or GCP) and its services
  • Familiarity with Kubernetes or other container orchestration platforms
  • Experience with infrastructure-as-code tools such as Terraform
  • Experience building highly available and observable systems at scale (preferably in Go or Python)

What We Offer

Elevate your lifestyle! Join one of Europe's fastest-growing tech powerhouses in a dynamic phase of expansion.

  • Immerse yourself in a diverse global community of 90+ nationalities.
  • Enjoy a competitive compensation package that goes beyond the norm, with perks like a HelloFresh-subsidized Pension Scheme, Berlin relocation support, and a hybrid working model.
  • Elevate your lifestyle with exclusive discounts on your weekly HelloFresh box and office meals.
  • Invest in your growth with a German language learning budget and access to the HelloFresh Academy.
  • Plus, we've got your well-being covered with mental health support, transportation perks, and working-parent-friendly benefits. From 24/7 gym access and wellbeing platforms like Headspace and Spill to sabbatical leave options, HelloFresh is not just a workplace; it's a lifestyle of perks and possibilities!
