AI/ML QA Engineer

LastHire, Kosovo
Remote
AI Summary

Test and evaluate AI systems, ensuring reliability and accuracy. Collaborate with engineers to resolve issues. Develop and maintain quality metrics for AI systems.

Technical Skills Required
Python, LLMs, RAG systems, vector databases, LangChain, CrewAI

Job Description


About LastHire

We don’t just build software — we build intelligence.

At LastHire, we are building the Autonomous Office: AI systems that actively perform business tasks rather than just answering questions. Our team develops production-grade AI agents, RAG architectures, and workflow automation systems that integrate directly into real business environments.

As we scale, we are looking for a QA Engineer for AI/ML systems who will ensure our models, agents, and pipelines perform reliably in real-world scenarios.



The Role

As an AI/ML QA Engineer, you will be responsible for testing, validating, and improving AI-driven systems before they are deployed to production environments.

You will work closely with engineers building LLM-powered applications and automation systems.



Responsibilities

AI System Testing

  • Test LLM responses, prompt chains, and agent workflows
  • Validate system outputs for accuracy, reliability, and safety

RAG Validation

  • Test retrieval pipelines and vector database results
  • Identify hallucinations, incorrect retrieval, and edge cases

Automation Workflow Testing

  • Evaluate multi-agent systems interacting with APIs and external services
  • Simulate real-world user scenarios

Quality Metrics

  • Define evaluation metrics for AI systems
  • Track system performance and failure patterns

Bug Reporting

  • Document issues clearly and collaborate with engineers to resolve them



What We're Looking For

AI/ML Knowledge

  • Understanding of LLMs and prompt engineering
  • Familiarity with RAG systems and vector databases

Technical Skills

  • Python basics
  • Experience testing APIs or software systems
  • Familiarity with tools like LangChain, CrewAI, or similar frameworks is a plus

Analytical Thinking

  • Ability to identify edge cases and unusual model behavior

Communication

  • Clear documentation of testing results and system behavior



Bonus Experience

  • Experience testing LLM applications
  • Knowledge of prompt evaluation frameworks
  • Familiarity with vector databases such as Pinecone, Chroma, or Weaviate



Why Join LastHire?


Work on Cutting-Edge AI

You’ll test and evaluate real production AI systems.

Early Team Impact

As an early contributor, your work will shape the quality standards of the entire platform.

Flexible Remote Work

Work from anywhere with a team focused on speed, innovation, and automation.



How to Apply

Send your application or portfolio through LinkedIn or contact us at:

lasthire.ai@outlook.com

