MLOps Data Engineer

Triton Digital, United States
Remote

Job Description


We can only accept candidates based in Ontario or Quebec.

The Audio Market Opportunity

Modern advertisers allocate spending through automated systems that interpret signals. For a channel to capture its fair share of budget, its inventory must be legible to those systems: standardized signals, structured metadata, and machine-readable supply pathways.

For the next evolution of media buying, audience signals are even more consequential. Agentic buying, in which autonomous systems independently interpret objectives, evaluate options, negotiate terms, and execute campaigns, is moving from concept to production. These systems don't browse inventory the way a human planner does. They query structured environments, evaluate supply through machine-readable signals, and pass over inventory they cannot read.

Our Mission

Triton Digital builds the infrastructure layer that makes audio inventory legible to modern — and next-generation — advertising markets. Our platform enables broadcasters, independent podcasters, and streaming music services to participate in automated buying on equal terms with the major platforms, aggregating over 100 billion audio impressions per month across podcast, streaming, and broadcast radio inventory.

The listener data team is at the heart of that mission. We enrich listener profiles to enable better advertising targeting through services such as Data Management Platform (DMP) integrations, the Profiler, and the GeoIP service, along with any other system that makes listener audiences continuously discoverable and actionable for buyers.

The Role

As our MLOps Data Engineer, you’ll be the bridge between data science and production systems — ensuring that models don’t just work in notebooks but thrive in real-world environments. You’ll design and automate CI/CD pipelines, optimize large-scale data processing with Apache Spark, and leverage Databricks to deliver machine learning solutions that are reliable, scalable, and fast. Your work will directly determine how quickly we can turn listener intelligence into structured, queryable signals that advertising systems — today’s DSPs and tomorrow’s agentic buyers — can act on.

What You’ll Do

  • Design, implement, and maintain CI/CD pipelines for machine learning workflows using tools like GitHub Actions, Azure DevOps, or Jenkins.
  • Build and optimize data processing pipelines in Apache Spark (PySpark and Scala) for large-scale, distributed listener datasets.
  • Deploy and manage Databricks environments, ensuring efficient cluster usage, job scheduling, and cost optimization.
  • Collaborate with data scientists to productionize ML models, integrating them into scalable APIs or batch processing systems that feed real-time, machine-readable audience signals.
  • Implement automated testing, monitoring, and alerting for ML pipelines to ensure the reliability and reproducibility that certified buyers require.
  • Champion best practices in version control, model registry management, and environment reproducibility.
  • Help evolve our listener data infrastructure toward agent-compatible supply — live, structured, queryable data feeds that autonomous buying systems can discover and act on without human mediation.


What You’ll Bring

  • Proven experience in Data Engineering, MLOps, and DevOps roles with a focus on automation and scalability.
  • Strong programming skills in Python, with hands-on experience in Apache Spark. Scala is a huge plus.
  • Advanced expertise in Databricks, including Delta Lake, Structured Streaming, and feature engineering.
  • Solid understanding of CI/CD principles and tools (e.g., GitHub Actions, Jenkins, Azure DevOps, GitLab CI, ArgoCD).
  • Familiarity with cloud platforms (AWS, Azure, or GCP) for data and ML workloads.
  • A problem-solving mindset and the ability to work closely with cross-functional teams.
  • Strong architectural mindset, capable of evaluating trade-offs across cost, performance, scalability, and maintainability when selecting tools and designing systems.
  • Experience working with containerized and orchestrated environments (Kubernetes / OpenShift), including deployment, scaling, and fault tolerance of data and ML workloads.
  • Advanced English required. French is an asset.
  • Familiarity with IAB data standards, programmatic advertising infrastructure, or AdTech data pipelines is a strong asset.


Our benefits package includes

  • Fully remote position (must be based in Ontario or Quebec)
  • 4 weeks of vacation + 5 paid personal days annually
  • Group insurance programs as of your first day, including access to telemedicine and an EAP
  • Collective RRSP with matching contribution
  • Internet reimbursement and more
  • The use of English is necessary to collaborate with internal and international teams and to access information and resources.


Triton Digital is an equal opportunity employer committed to fostering a diverse, equitable, and inclusive workplace where all employees are respected, supported, and enabled to perform at their highest potential.
