Systems Engineer - AI Coding Agent Evaluation

Mercor • United States
Remote
AI Summary

Mercor seeks a Systems Engineer to evaluate frontier AI coding agents on complex engineering tasks. The role involves reviewing model-generated code, architecture decisions, and systems implementations while identifying bugs and performance bottlenecks. Candidates must have 2+ years of systems engineering experience and use AI coding tools regularly.

Key Highlights
Evaluate frontier AI coding agents on complex engineering tasks
Review model-generated code, architecture decisions, and systems implementations
Identify bugs, edge cases, performance bottlenecks, and failure modes
Key Responsibilities
Use frontier AI coding agents to complete and evaluate complex engineering tasks
Review model-generated code, architecture decisions, and systems implementations
Identify bugs, edge cases, performance bottlenecks, and failure modes
Compare outputs from multiple frontier models and assess their strengths and weaknesses
Apply professional engineering judgment to realistic systems engineering scenarios
Work independently and asynchronously to meet deadlines while improving AI model performance
Technical Skills Required
distributed systems, networking, operating systems, storage systems, infrastructure software, database internals, AI coding agents (Cursor, Claude Code, Codex, Windsurf, Gemini CLI)
Benefits & Perks
Remote work
Nice to Have
Experience building highly scalable and performance-critical systems

Job Description


About The Job

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position: Systems Engineer (Coding Agent Experience)

Type: Contract

Compensation: $85/hour

Location: Remote

Role Responsibilities

  • Use frontier AI coding agents to complete and evaluate complex engineering tasks.
  • Review model-generated code, architecture decisions, and systems implementations.
  • Identify bugs, edge cases, performance bottlenecks, and failure modes.
  • Compare outputs from multiple frontier models and assess their strengths and weaknesses.
  • Apply professional engineering judgment to realistic systems engineering scenarios.
  • Work independently and asynchronously to meet deadlines while improving AI model performance.


Qualifications

Must-Have

  • 2+ years of professional systems engineering experience.
  • Experience with distributed systems, networking, operating systems, storage systems, infrastructure software, or database internals.
  • Regular use of AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
  • Ability to evaluate model-generated systems designs and implementations.


Preferred

  • Experience building highly scalable and performance-critical systems.


Compensation & Legal

  • $400 per accepted task
  • Compensation is tied to accepted work.


Application Process (Takes 20–30 mins to complete)

  • Upload resume
  • AI interview based on your resume
  • Submit form


Resources & Support

  • For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome
  • For any help or support, reach out to: support@mercor.com


PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.
