AI Summary
We're looking for a Machine Learning Engineer with experience in reinforcement learning to advance how we train, evaluate, and deploy embodied AI behaviors.
Key Highlights
Design and optimize end-to-end pipelines for training reward models and RL agents
Develop tooling for data processing, annotation, and inference within RL workflows
Build, refine, and deploy reward models that encode safe, interpretable, and effective driving behaviors
Benefits & Perks
Attractive compensation with salary and equity
Immersion in a team of world-class researchers, engineers, and entrepreneurs
Bespoke learning and development opportunities
Relocation support with visa sponsorship
Flexible working hours
Onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave
Job Description
At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.
About Us
Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.
Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.
In our fast-paced environment, big problems ignite us. We embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.
At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.
Make Wayve the experience that defines your career!
The role
We’re looking for a Machine Learning Engineer with strong experience in reinforcement learning (RL), reward modeling, and large-scale ML systems to advance how we train, evaluate, and deploy embodied AI behaviors. This role sits at the intersection of ML engineering, applied RL research, and ML systems, working on the frameworks that guide how our autonomous agents learn from data, simulation, and real-world experience.
As an MLE on the Accelerated Learning Loop team, you will:
- Design and optimise end-to-end pipelines for training reward models and RL agents, ensuring they are reproducible and high-throughput.
- Develop tooling for data processing, annotation, and inference within RL workflows.
- Build, refine, and deploy reward models that encode safe, interpretable, and effective driving behaviours.
- Integrate reward models with diverse data sources: real-world trajectories, simulation, and synthetic datasets.
- Conduct ablations, hyperparameter explorations, and controlled studies to analyse how reward structures, data composition, and training dynamics affect policy performance.
- Diagnose failure modes, investigate emergent behaviours, and iterate on reward objectives to improve reliability.
- Work closely with RL scientists to translate research ideas into scalable engineering solutions.
- Partner with evaluation teams to integrate reward and RL models into offline/online testing suites and simulation frameworks.
- Establish best practices around code quality, reproducibility, and deployment readiness.
- Build internal tools and visualisations that enable faster debugging, deeper insights, and more efficient iteration across the RL and reward modeling stack.
This role is ideal for someone who enjoys building systems and running fast, grounded experiments, and who is motivated by delivering real impact on the behaviour of embodied AI systems in the real world.
- Experience applying reinforcement learning techniques, including offline RL, reward modeling, RLHF-style approaches, or similar
- Proficiency in Python and modern ML frameworks (e.g., PyTorch, JAX, Ray, or equivalent)
- Experience building ML pipelines or large-scale training workflows in production or research environments
- Strong understanding of simulation environments and/or real-world behavioural data
- Ability to design and run experiments, analyse results, and turn insights into actionable improvements
- Strong problem-solving skills and the ability to work effectively in cross-functional teams
- Experience contributing to research (e.g., publications at NeurIPS, ICLR, CoRL, CVPR)
- Understanding of self-driving technologies, sensor data, or real-time decision-making algorithms
- Experience with distributed training systems and cloud compute environments (Azure, AWS, GCP)
- Exposure to large-scale simulation, embodied AI, or robotics systems
Benefits & Perks
- Attractive compensation with salary and equity
- Immersion in a team of world-class researchers, engineers and entrepreneurs
- A unique position to shape the future of autonomy and tackle the biggest challenge of our time
- Bespoke learning and development opportunities
- Relocation support with visa sponsorship
- Flexible working hours - we trust you to do your job well, at times that suit you and your team
- Benefits such as an onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave, and more!
We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.
For more information visit Careers at Wayve.
To learn more about what drives us, visit Values at Wayve.
DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form. This helps us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.