Job Description
We’re hiring a Senior QA Automation Engineer to lead quality practices across our AI-driven products. This role is deeply rooted in Test-Driven Development (TDD) and focuses on ensuring reliability, consistency, and safety in systems built on LLMs, APIs, web platforms, and mobile applications.
You won’t operate as a traditional, downstream QA engineer. Instead, you’ll work alongside engineers from the earliest stages of development, helping define test cases, acceptance criteria, and automated validations before code is written, especially for complex AI workflows where regressions, drift, and non-determinism are critical risks.
This is a foundational role where you’ll shape QA strategy, automation standards, and AI testing practices from day one.
Responsibilities
Quality Strategy & TDD Ownership
- Drive a TDD-first approach across backend, frontend, mobile, and AI development.
- Partner with engineers to define test cases, expected behaviors, and acceptance criteria prior to implementation.
- Ensure quality is built into system design, not validated after the fact.
- Identify quality risks early in AI, API, and UI workflows and proactively mitigate them.
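To make the TDD-first expectation concrete, here is a minimal sketch of the workflow described above: the test is written first as the agreed acceptance criterion, and the implementation follows to satisfy it. The function and business rule are hypothetical illustrations, not part of this role's actual codebase.

```python
# TDD sketch: the test below is authored before the implementation and
# serves as the acceptance criterion agreed with engineers up front.
# `apply_discount` and the 50% cap are hypothetical examples.

def test_discount_is_capped_at_50_percent():
    # Expected behavior, defined prior to implementation.
    assert apply_discount(price=100.0, percent=80) == 50.0
    assert apply_discount(price=100.0, percent=20) == 80.0

def apply_discount(price: float, percent: float) -> float:
    """Implementation written afterwards, to make the test pass."""
    capped = min(percent, 50.0)
    return price * (1 - capped / 100)

test_discount_is_capped_at_50_percent()
```

In practice the test would live in a test suite (e.g. pytest) and fail in CI until the implementation lands.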
Test Automation
- Design, build, and maintain automated test suites for:
  - APIs
  - Web applications
  - Mobile applications (React Native)
  - AI / LLM-driven workflows
- Ensure automated tests serve as executable specifications for system behavior.
- Integrate automated test execution into CI/CD pipelines as quality gates.
- Continuously improve test coverage, stability, and execution performance.
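The idea of tests as "executable specifications" can be sketched as follows. This is an illustrative, self-contained example with a stand-in handler; the endpoint behavior and payloads are hypothetical, not this team's actual API.

```python
# Sketch: an automated API test doubling as an executable specification.
# `create_user` stands in for the real handler under test; in CI, these
# tests run as a quality gate, where a non-zero exit blocks the pipeline.

def create_user(store: dict, email: str) -> dict:
    # Stand-in implementation (hypothetical).
    if "@" not in email:
        return {"status": 422, "error": "invalid email"}
    store[email] = {"email": email}
    return {"status": 201, "user": store[email]}

def test_rejects_invalid_email():
    store = {}
    resp = create_user(store, "not-an-email")
    assert resp["status"] == 422
    assert "not-an-email" not in store  # nothing persisted on failure

def test_persists_valid_email():
    store = {}
    resp = create_user(store, "a@example.com")
    assert resp["status"] == 201
    assert store["a@example.com"]["email"] == "a@example.com"

test_rejects_invalid_email()
test_persists_valid_email()
```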
LLM & AI Testing
- Define and implement TDD-aligned testing strategies for LLM-based systems, including:
  - Prompt validation and regression testing
  - Behavioral consistency across prompt or model changes
  - Drift detection and output variance analysis
  - Validation of Retrieval-Augmented Generation (RAG) pipelines
- Use and help evolve LLM testing and evaluation tools such as Galtea, promptfoo, or similar frameworks.
- Collaborate with AI engineers on:
  - Model and prompt migrations
  - Agent behavior changes
  - Safety, correctness, and edge-case handling
- Enable automated AI validation as part of CI workflows.
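One shape drift detection can take is comparing new model outputs against stored baselines and failing CI when similarity drops below a threshold. The sketch below is a simplified assumption of such a check: `difflib` stands in for a real semantic-similarity metric (e.g. embedding cosine similarity), and the prompts and baselines are hypothetical.

```python
# Drift-check sketch: flag prompts whose new output diverges from the
# stored baseline. SequenceMatcher is a stand-in for a semantic metric.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def check_drift(baselines: dict, outputs: dict, threshold: float = 0.8) -> list:
    """Return ids of prompts whose new output drifted past the threshold."""
    return [pid for pid, base in baselines.items()
            if similarity(base, outputs.get(pid, "")) < threshold]

baselines = {"greet": "Hello! How can I help you today?"}
outputs = {"greet": "Hello! How can I help you today?"}
assert check_drift(baselines, outputs) == []  # identical output: no drift

outputs["greet"] = "ERROR: model unavailable"
assert check_drift(baselines, outputs) == ["greet"]  # flagged as drift
```

Tools such as promptfoo wrap this pattern in declarative test cases with configurable assertions, which is what makes it CI-friendly.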
Manual & Exploratory Testing
- Perform targeted exploratory testing for complex AI-driven and cross-system scenarios.
- Validate end-to-end flows from API to frontend when automation alone is insufficient.
- Support release readiness with a strong bias toward automation over manual regression.
Collaboration & Leadership
- Work closely with Engineering, Product, and AI teams to align on expected behaviors and quality standards.
- Advocate for testability-first design and automation-friendly architectures.
- Mentor other engineers and QA team members on TDD, automation, and AI quality practices.
- Help evolve team culture toward shared ownership of quality.
Requirements
Experience
- 5+ years of experience in QA Engineering or Software Engineering with a strong automation and TDD focus.
- Proven experience applying Test-Driven Development in real-world systems.
- Experience testing APIs, web applications, and mobile applications.
Technical Proficiency
- Strong programming or scripting skills (Python or JavaScript preferred).
- Ability to design tests as first-class artifacts that guide development.
- Experience integrating automated tests into CI/CD pipelines.
- Ability to read, understand, and influence production code for better testability.
AI / LLM Knowledge
- Hands-on experience testing AI- or LLM-powered features.
- Strong understanding of:
  - Prompt design and prompt versioning
  - Model drift and behavioral regressions
  - Retrieval-Augmented Generation (RAG) systems
  - Model and prompt migration strategies
- Experience with LLM testing frameworks (e.g. Galtea, promptfoo, or similar), including defining evaluation criteria for non-deterministic outputs.
Communication
- Fluent English communication.
- Ability to collaborate effectively with engineers, product managers, and AI specialists.
Bonus Skills
- Experience testing React Native mobile applications.
- Experience with frontend automation for modern web applications.
- Familiarity with AI evaluation metrics, scoring strategies, and human-in-the-loop validation.
- Experience in regulated or data-sensitive environments (e.g. fintech).
Our Stack (Representative, Not Exhaustive)
- Frontend: Web applications, React Native
- Backend: APIs
- Automation: Custom automation frameworks, CI-integrated testing
- AI Systems: LLM-powered features, RAG pipelines, agent-based workflows
- LLM Testing: Galtea, prompt-based evaluation frameworks
- Infra: Cloud-native services, Docker
- Data: APIs, databases, vector stores
- CI/CD: Automated pipelines with quality gates
Benefits
- Competitive salary in USD
- Unlimited PTO
- Paid local public holidays
- Fully remote, with availability for an on-site week in Montevideo every 6 to 8 weeks (all travel expenses covered)
- Standard working hours in your local time zone, with flexibility to overlap with PST for some meetings when needed
- US company stock options