Senior ML/AI Systems Engineer (Fractional CTO-Level IC)

Braintrust • Namer

Remote

This job is no longer active and is no longer accepting applications.

AI Summary

Design and implement AI-powered systems for physician credentialing, ensuring scalability, security, and reliability. Collaborate with the team to develop a multi-agent verification system using LLM infrastructure. Contribute to the architecture, orchestration, and decision logic of the system.

Key Highlights

Design and implement AI-powered systems for physician credentialing

Collaborate with the team to develop a multi-agent verification system

Contribute to the architecture, orchestration, and decision logic of the system

Key Responsibilities

Establish production monitoring and observability

Review architecture for security and scalability

Optimize agent performance and cost

Build orchestration, decision logic, and escalation systems

Ensure readiness for scale (25K+ physician wallets, 150+ enterprise customers)

Technical Skills Required

LLM infrastructure

Backend: TypeScript/Node.js, PostgreSQL, APIs, job queues

Orchestration: LangChain/LlamaIndex-style frameworks

Security: encryption, access controls, audit logging

Observability: metrics, logging, alerting, distributed debugging

Benefits & Perks

$8K+/month compensation

Flexible schedule with core overlap

Potential conversion to full-time CTO role with 0.25%-6% equity

Nice to Have

Healthcare compliance (PII, Joint Commission, NCQA)

Web automation/scraping (Playwright, Puppeteer)

Multi-agent coordination patterns

LLM cost optimization

Job Description

Senior ML / AI Systems Engineer (Fractional CTO-Level IC)

Location: Remote (US-based preferred)

Commitment: Full-time preferred; ½ time (~20 hrs/week) possible with flexibility to scale

Compensation: $8K+/month depending on experience and hours, plus performance bonuses and potential equity

Contract Type: 1099 Independent Contractor

Reports To: Chief Product Officer

About Evercred

Evercred is transforming physician credentialing from a 90–120 day manual process into a roughly 14–30 day automated workflow using AI agents. We are building a production system that orchestrates multi-step verification across state medical boards, education institutions, and employers, with real-time monitoring, intelligent escalation, and compliant audit trails.

Stage: Pre-revenue, 7 signed LOIs (~$250K), targeting $1M ARR by Summer 2026

Stack: Next.js, TypeScript, PostgreSQL, Prisma, Anthropic Claude API, HashiCorp Vault, LangChain-style orchestration

The Role

This is a fractional CTO-level individual contributor role focused on architecture, orchestration, observability, and reliability of a multi-agent verification system.

We are working with an external agency on agent implementation and need a senior in-house engineer to:

Establish production monitoring and observability
Review architecture for security and scalability
Optimize agent performance and cost
Build orchestration, decision logic, and escalation systems
Ensure readiness for scale (25K+ physician wallets, 150+ enterprise customers)

Why this is interesting

First production AI agent system in healthcare credentialing
High-impact problem: eliminating 90–120 day bottlenecks that cost hospitals millions
True multi-agent orchestration (parallel workflows, failure modes, latency variance)
Non-deterministic systems engineering on LLM infrastructure
Regulated environment: HIPAA-adjacent, Joint Commission auditability, long-term retention

What You’ll Do

Phase 1: Monitoring & Architecture Review

(Weeks 1–6 ½-time | 1–3 weeks full-time)

Observability & reliability

Implement monitoring for agent workflows (performance, cost, success/failure)
Build dashboards, alerting, and escalation detection
Create debugging tools for non-deterministic agent behavior

Security & compliance

Review architecture for sensitive PII handling (SSN, credentials, portals)
Validate encryption, access controls, audit logging

Deliverables:

Monitoring dashboards, alerting system, security review doc, cost projection model

Phase 2: Orchestration & Decision Logic

(Weeks 7–12 ½-time | 4–8 weeks full-time)

Agent orchestration

Design multi-agent workflow orchestration
Implement job queues, polling, webhooks, retries, timeouts
Build agent health monitoring and failover to manual workflows

Decision & escalation intelligence

Auto-verify vs manual review logic
Confidence scoring (data quality, source reliability, discrepancies)
Explainable escalation routing

Performance & integrations

Prompt optimization, caching, and cost controls
API integrations (FSMB PDC, ABMS CertiFACTS, NPI, OIG/SAM, others)
Structured/unstructured response parsing

Deliverables:

Orchestration framework, decision logic system, 3+ API integrations live, 20–30% cost-per-verification reduction

Phase 3: Scale & Production Readiness (Ongoing)

Scale to 25K+ wallets and 150+ organizations
Rate limiting, circuit breakers, graceful degradation
Immutable audit trails with 7-year retention
Agent performance analytics, anomaly detection
Benchmarking and regression testing

What You’ll Bring

Required (5–8 years)

Production AI systems

Shipping LLM-powered systems to production
Building reliability on non-deterministic models
Debugging hallucinations, regressions, agent failures
AI observability and monitoring

Technical

LLM systems: prompting, error handling, cost management
Backend: TypeScript/Node.js, PostgreSQL, APIs, job queues
Orchestration: LangChain/LlamaIndex-style frameworks
Security: encryption, access controls, audit logging
Observability: metrics, logging, alerting, distributed debugging

Strongly Preferred

Healthcare compliance (PII, Joint Commission, NCQA)
Web automation/scraping (Playwright, Puppeteer)
Multi-agent coordination patterns
LLM cost optimization

Work Style

Comfortable in pre-revenue ambiguity
Balances speed with compliance
Strong async communication
Effective with external contractors
Pragmatic about technical debt

What You’ll Get

Impact

Architect the intelligence layer of a first-to-market AI verification system
Direct contribution to revenue milestones and scale

Learning

Deep production LLM and agent orchestration experience
Regulated healthcare systems engineering
Fractional CTO-level strategic ownership

Flexibility

Fully remote, async-first
Deliverables over hours
Flexible schedule with core overlap

Compensation

$8K+/month (scales with hours + experience)
Potential conversion to full-time CTO role
0.25%–6% equity (depending on part-time vs. full-time commitment; 1-year cliff)

How We Work

Kanban, weekly delivery cycles
Weekly demos (async-recorded)
Daily async standups
Weekly sync (45 min)
Tools: GitHub, Slack
Definition of Done: merged, deployed, documented/demoed
Target: 5–7 medium stories/week (quality > quantity)

Interview Process

Intro call (30 min)
Technical deep-dive (60 min)
Async code/architecture review
Culture fit (30 min)
Paid trial (1 week) — ship monitoring or orchestration component

Timeline: 1–2 weeks end-to-end

To Apply

Email: help@hpec.io

Subject: ML Engineer

Include:

Resume/LinkedIn
GitHub/code samples
2–5 min video covering:
Production LLM experience
Agent orchestration challenge
Interest in Evercred
Availability and start date

Team

Mark (CPO): Product, engineering coordination
Leah (CEO): Founder, Emergency Physician, vision, strategy, partnerships, sales GTM
Engineering: Samil (Full Stack), Elena (Full Stack), Adil (DevOps), Kaab (QA)
External Agency: Building initial verification agents (education, employment, licensing)

Job Overview

Posted Date: Feb 20, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: Namer

Annual Salary: 96,000 USD

Category: Programming

Company: Braintrust
