Machine Learning Engineer - Observability and Evaluation

neartech search United Kingdom
Remote
AI Summary

Join a UK-based firm as a remote Machine Learning Engineer to monitor and evaluate AI system performance in production environments. Key responsibilities include debugging model behavior, designing evaluation frameworks, and collaborating with engineering and product teams. The ideal candidate has strong Python engineering skills, experience with LLMs and VLMs, and a collaborative mindset.

Key Highlights
  • Monitor and evaluate AI system performance
  • Debug model behaviour and edge cases
  • Collaborate with engineering and product teams
Technical Skills Required
Python, LLMs, VLMs, Arize Phoenix, LangSmith
Benefits & Perks
£80,000 - £100,000 + Equity
Remote work
Equity

Job Description


Machine Learning Engineer - Observability & Evaluation - £80,000 - £100,000 + Equity


I'm working with a UK firm looking for a remote MLE to join them on a permanent basis. The firm works in risk and fraud detection and has a well-adopted suite of products that is being incorporated into more industries and, with them, more use cases. They're now looking for a delivery-focused AI Engineer to join the team, specialising in monitoring and evaluating AI system performance in production environments. This role suits someone with a strong background in production-level AI/ML, a keen interest in system observability, and a collaborative mindset.


Note: This isn't a founder role; they're looking for a low-ego individual who can work collaboratively as part of a remote team within the UK. Candidates from a start-up/scale-up background who have worked in a similar environment are especially well suited.


Key Responsibilities:

  • Monitor, assess, and improve AI model performance in production.
  • Debug inconsistent model behaviour and edge cases.
  • Design and implement evaluation frameworks to validate reasoning and decision quality.
  • Work with tools such as Arize Phoenix, LangSmith, or similar observability platforms.
  • Collaborate with engineering and product teams to ensure robust monitoring and reporting.


Requirements:

  • Strong Python engineering skills, with a focus on reliable, production-ready code.
  • Experience with LLMs (Large Language Models) and VLMs (Vision Language Models).
  • Hands-on experience with production observability tools such as Arize Phoenix or LangSmith.
  • Ability to design and execute evaluation methodologies for AI systems.
  • Strong problem-solving skills and attention to detail.
  • Collaborative, low-ego mindset; able to work effectively in a small, remote team.


The role is fully remote but open only to UK-based candidates who do not require visa sponsorship.

