Principal Machine Learning Engineer

Remote

Job Description


Location: Remote (U.S.-based); not eligible for visa transfer or sponsorship
Industry: Ad Tech

Salary: $240,000 - $280,000 base + benefits

About the Role

I have partnered with a leading AI platform in the programmatic ad buying space. A recognized leader in its field, the company is seeking a Principal Machine Learning Engineer to drive the strategy, design, and execution of next-generation ML systems across the company.

This is a hands-on, high-leverage role for a senior technical leader with deep experience in distributed ML, large-scale MLOps, and ad tech systems. You will define the ML roadmap, build scalable infrastructure, mentor senior engineers, and influence product and business strategy.

As a principal-level engineer, you will not only lead technical execution but also shape the vision for AI products, ML infrastructure, and observability frameworks. You will partner with leadership across Engineering, Product, and Data Science to deliver systems that directly power the company's AI products.

Key Responsibilities:

Strategic Leadership & Technical Vision

  • Define and drive the ML strategy across multiple product lines, aligning technical decisions with business objectives and KPIs.
  • Set best practices for architecture, design, and deployment of distributed ML systems in a fast-growing startup environment.
  • Lead technical design reviews, enforce engineering standards, and guide adoption of modular, maintainable, and scalable ML infrastructure.
  • Mentor and grow senior and mid-level ML engineers; foster cross-functional collaboration with Product, Data Science, and Engineering teams.

Distributed ML Systems & Model Development

  • Architect, implement, and optimize large-scale neural network systems for audience modeling, bid optimization, and real-time decisioning.
  • Lead multi-GPU, distributed training pipelines using PyTorch DDP and Ray (Train, Tune), including automated hyperparameter search (ASHA, early stopping).
  • Design robust feature engineering pipelines with PySpark and embedding layers for categorical, behavioral, and contextual features.
  • Establish system-wide standards for model evaluation, champion/challenger workflows, and performance benchmarking.

MLOps & Production Excellence

  • Own the end-to-end ML lifecycle, from training and batch inference to monitoring, observability, and automated rollback/recovery.
  • Architect fault-tolerant, reproducible ML pipelines leveraging Databricks, Delta Lake, Unity Catalog, MLflow, and cloud platforms (AWS S3, EC2).
  • Define and implement model versioning, artifact management, experiment tracking, and observability standards across products.
  • Collaborate with engineering teams to optimize production dataflows, ensure high availability, and scale infrastructure for multiple product lines.

Innovation & Future ML Capabilities

  • Evaluate and integrate emerging ML technologies (LLMs, vector embeddings, reinforcement learning, large-scale ETL/ELT, feature stores).
  • Explore new approaches to programmatic optimization, audience modeling, and AI-driven bid strategies.
  • Provide technical leadership in cross-functional planning and product roadmap discussions; influence strategic decisions on ML infrastructure and AI products.

Required Qualifications:

  • Master's or PhD in Computer Science, Statistics, Machine Learning, or related field with 10+ years of ML engineering experience, including distributed systems.
  • Deep expertise in PyTorch (custom architectures, embedding layers, MLPs, binary classification heads).
  • Proven experience designing and deploying production ML systems at scale (Databricks, Delta Lake, Unity Catalog, Ray, MLflow).
  • Expert-level Python and PySpark skills; experience with large-scale feature engineering and batch inference pipelines.
  • Strong MLOps knowledge: versioning, monitoring, reproducibility, model serving, observability (Prometheus, Grafana, Datadog).
  • Cloud platform experience (AWS S3, EC2) and data warehousing (Snowflake).
  • Experience building and mentoring teams, leading cross-functional projects, and influencing product and technical strategy.
  • Strong communication skills and ability to work with senior stakeholders in fast-paced environments.
  • AdTech/programmatic advertising experience (DSPs, bid optimization, lookalike modeling).
  • Experience with LLMs, vector embeddings, reinforcement learning, feature stores, or clean rooms.
  • Distributed training across multi-GPU clusters using Ray (Train, Tune, Datasets).
  • Experience deploying ML in Kubernetes-based environments and integrating event-driven messaging systems (SQS, SNS, MSK, Redpanda).
  • Experience in CI/CD automation, internal ML libraries, and observability tooling at scale.

Desired Skills and Experience:

Distributed ML & MLOps

PyTorch, PySpark, Ray

Databricks, Delta Lake, Unity Catalog

MLflow, CI/CD, Kubernetes

AWS (S3, EC2), Snowflake

Multi-GPU training & hyperparameter tuning

AdTech / Programmatic Advertising

Team leadership & cross-functional collaboration

Observability: Prometheus, Grafana, Datadog

LLMs, embeddings, reinforcement learning

Sphere Digital Recruitment currently have a variety of job opportunities across digital, so feel free to get in touch to find out how we can help you. Please take a look at our website.


Sphere is an equal opportunities employer. We encourage applications regardless of ethnic origin, race, religious beliefs, age, disability, gender or sexual orientation, and any other protected status as required by applicable law.


If you require any adjustments or additional support during the recruitment process for any reason whatsoever, please let us know.

