Senior Machine Learning Infrastructure Engineer

audiience™ Oregon Metropolitan Area
Remote

Job Description


About Audiience


We're transforming how content is created and trusted in publishing. We deliver technology that is accurate, scalable, and creative – built to elevate both craft and integrity. We attract the best in the business not through traditional methods, but through the solutions we create and the culture we've built.


Our Culture


  • Low ego, high confidence – We sharpen each other through continuous improvement
  • Open communication – Even when it creates necessary conflict
  • Systems thinking – We solve complex problems through collaboration
  • Human-centered – We work because we love what we do, but we are human first
  • Integrity & creativity – We win together or not at all


The Role


Most AI companies fail not because of bad ideas, but because their infrastructure can't keep pace with their ambition. At Audiience, we've carved out a niche in publishing that has gone largely untouched by AI, and we're building our technology entirely from scratch, with a founding team of engineers who have the rare combination of vision and capability to pull it off.


We're looking for the engineer who understands that great ML doesn't live in notebooks – it lives in systems. You'll join a highly selective founding engineering team: a small, senior group of builders who move fast, think deeply, and hold each other to an extraordinary standard. Your job is to transform brilliant research into stable, repeatable, and scalable training infrastructure. You build the machine that builds the machine.


This is not a role defined by pedigree or years of experience, though those matter. We want builders who are tenacious and self-taught, who thrive in ambiguity, and who can own mission-critical infrastructure from the ground up. If you're energized by building something that has never existed, for a market that has never had it, this seat is yours.


What You'll Do


  • Architect and own the end-to-end ML training infrastructure – from data ingestion through experiment tracking to model checkpointing
  • Build scalable, reproducible training pipelines that empower the research team to iterate fast without chaos
  • Own compute orchestration, distributed training setups, and GPU cluster management
  • Implement and manage experiment tracking (W&B, MLflow) and version-controlled data pipelines
  • Collaborate directly with research to harden experimental approaches into production-ready pipelines
  • Identify and systematically eliminate bottlenecks in training speed, cost, and reliability


What We're Looking For


Core Technical Expertise


  • Deep experience with distributed training frameworks (FSDP, DeepSpeed, Megatron, or equivalent)
  • Strong proficiency in PyTorch and modern ML tooling
  • Experience with cloud compute orchestration (AWS, GCP, or Azure) at training scale
  • Familiarity with containerization (Docker, Kubernetes) for ML workloads
  • Solid understanding of ML experiment tracking, data versioning, and reproducibility
  • Ability to profile and optimize training throughput and resource utilization


Communication


  • Communication excellence – You can clearly articulate infrastructure decisions, tradeoffs, and failure post-mortems in writing
  • Demonstrated ability to document systems architecture and explain technical decisions to a cross-functional team


Background


  • Degree not required
  • Prior experience as an ML infrastructure engineer, MLOps engineer, or applied researcher with strong infra instincts
  • Startup or fast-moving environment experience is a plus


Your Mindset


  • Problem-solving prowess – You see problems others don't and solve them in ways others can't
  • Tenacious learner – Self-taught capabilities and continuous improvement are in your DNA
  • Systems thinker – You understand how complex systems interact and create elegant solutions
  • Results-oriented – You have a bias toward flexibility, impact, and getting it done
  • Collaborative by nature – You believe we can only win if we do it together


Nice to Have


  • Experience with CUDA, fused kernels, or low-level performance optimization
  • Prior work with LLMs or large-scale foundation model training
  • Experience building internal ML platforms or developer tooling for research teams
  • Open-source contributions or published engineering writeups
  • Previous startup or early-stage engineering experience
  • Volunteer work


Why Join Us


  • Build something that has never existed – in a market that has never had it
  • Join a founding core of technical builders who are redefining what's possible in publishing AI
  • Solve extraordinary problems with fewer resources than competitors – your impact is magnified
  • Work with brilliant misfits who value craft, integrity, and creativity over politics
  • Own your work end to end – from pipeline architecture to production, you have agency and impact
  • Continuous learning – work at the bleeding edge of AI infrastructure with teammates who challenge and sharpen you


Location


This role is fully remote; however, you must be willing to work Pacific Time hours. Occasional travel will be required for team workshops.


Come Work With Us!


We offer competitive compensation and benefits, equity, generous time off to recharge, and flexible working hours.


We are an equal opportunity employer committed to building a diverse team. We welcome applications from all backgrounds, especially those who might not check every box but possess the savant-level problem-solving abilities we seek.

