Senior Backend Engineer for Machine Learning Infrastructure and Reliability

SR2 | Socially Responsible Recruitment | Certified B Corporation™ • EMEA

Remote

This job is no longer active and is no longer accepting applications.

AI Summary

Design, build, and operate production Django services that orchestrate distributed ML workflows. Build high-throughput async job processing systems and implement reliability patterns. Collaborate with ML teams to productionise training and inference pipelines.

Key Responsibilities

Design and maintain Django services supporting ML inference workflows

Build high-throughput async job processing systems using queues and schedulers

Implement reliability patterns including retries, idempotency, rate limiting, and backpressure

Own observability strategy including metrics, tracing, logging, and alerting

Lead incident response and drive long-term reliability improvements

Collaborate with ML teams to productionise training and inference pipelines

Support CI/CD and infrastructure automation using Infrastructure as Code

Technical Skills Required

Python, Django, Celery, RQ, Arq, AWS, GCP, Terraform, Postgres, Redis

Benefits & Perks

Fully remote within CET time zone

High autonomy and strong technical ownership

Nice to Have

Experience operating ML infrastructure or MLOps platforms

Familiarity with orchestration tools (Airflow, Temporal, Prefect, Step Functions)

Experience with observability stacks such as Prometheus, Grafana, or OpenTelemetry

Job Description

⚙️ Senior Backend Engineer – ML Infrastructure & Reliability

📍 Remote (CET) | Full-Time

A high-growth AI technology company is building large-scale machine learning platforms that power content generation for global enterprise brands. Their production systems coordinate high-throughput ML inference across multiple services and external providers.

They are hiring a Senior Backend Engineer to take ownership of reliability, orchestration, and performance across their core backend platform.

💻 The Role

You will design, build, and operate production Django services that orchestrate distributed ML workflows. The focus is on building highly reliable backend systems capable of handling asynchronous processing at scale.

🛠 Key Responsibilities

• Design and maintain Django services supporting ML inference workflows

• Build high-throughput async job processing systems using queues and schedulers

• Implement reliability patterns including retries, idempotency, rate limiting, and backpressure

• Own observability strategy including metrics, tracing, logging, and alerting

• Lead incident response and drive long-term reliability improvements

• Collaborate with ML teams to productionise training and inference pipelines

• Support CI/CD and infrastructure automation using Infrastructure as Code

✅ Requirements

• Strong Python backend engineering background

• Proven experience running Django applications in production

• Experience building asynchronous processing systems (Celery, RQ, Arq or similar)

• Solid understanding of distributed systems reliability principles

• Experience with AWS or GCP cloud environments

• Practical Infrastructure as Code experience (Terraform or similar)

⭐ Nice To Have

• Experience operating ML infrastructure or MLOps platforms

• Familiarity with orchestration tools (Airflow, Temporal, Prefect, Step Functions)

• Experience with observability stacks such as Prometheus, Grafana, or OpenTelemetry

• Experience scaling Postgres or caching systems like Redis

🌟 Why Join

• Own reliability of business-critical AI production systems

• Solve complex distributed systems challenges

• Work closely with ML and backend engineering teams

• Fully remote within CET time zone

• High autonomy and strong technical ownership

📩 If you are interested, message me directly or apply via LinkedIn.

Job Overview

Posted Date: Feb 06, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: EMEA

Category: Programming

Company: SR2 | Socially Responsible Recruitment | Certified B Corporation™
