Founding Engineer, ML Inference

reactor • San Francisco Bay Area

Visa Sponsorship • Relocation

AI Summary

We're seeking a highly technical Founding Engineer with expertise in high-performance ML engineering to drive real-time model performance for diffusion models. This role involves designing a high-performance in-house inference runtime and optimizing neural network models for inference. Collaboration with model partner teams is also essential.

Key Highlights

Real-time model performance for diffusion models

High-performance in-house inference runtime

Optimizing neural network models for inference

Key Responsibilities

Drive our frontier position on real-time model performance for diffusion models.

Design and implement a high-performance in-house inference runtime.

Implement optimizations using torch.compile, custom CUDA kernels, and specialized inference frameworks.

Technical Skills Required

PyTorch, TensorRT, TransformerEngine, Nsight, ONNX Runtime, torch.compile, CUDA kernels, specialized inference frameworks

Benefits & Perks

Competitive San Francisco salary

Meaningful early equity

Health, dental, and vision coverage

Relocation support

Job Description

About Us

We're building a future where anyone can create interactive media applications that delight, educate, and simulate. To get there, we're building a new kind of platform for real-time generative media, one that lets developers go from idea to immersive, dynamic experience in seconds. Join a small, focused team of YC and unicorn founders and senior engineers with deep expertise in 3D, generative video, developer platforms, and creative tools, all working to push the boundaries of what's possible.

About the Role

We're looking for a Founding Engineer, ML Inference with deep expertise in high-performance ML engineering. This is a highly technical, high-impact role focused on squeezing every drop of performance from generative media models.

You'll work across the model-serving stack, designing novel inference frameworks, optimizing inference performance, and shaping our competitive edge in ultra-low-latency, high-throughput environments.

What You'll Do

Drive our frontier position on real-time model performance for diffusion models
Design and implement a high-performance in-house inference runtime

Implement optimizations using torch.compile, custom CUDA kernels, and specialized inference frameworks
Optimize neural network models for inference through quantization, pruning, and architectural modifications while maintaining accuracy
Profile and benchmark model performance to identify computational bottlenecks
Collaborate directly with model partner teams to integrate their models into our platform

Required Skills

Strong foundation in systems programming, with a track record of identifying and resolving bottlenecks
Deep expertise in the ML infrastructure stack: PyTorch, TensorRT, TransformerEngine, Nsight, ONNX Runtime
Experience with model compilation, quantization (INT8/FP16), and advanced serving architectures
Working knowledge of GPU hardware (NVIDIA) and the ability to dive deep into the stack as needed

Strong understanding of transformer architectures and modern ML model optimization techniques

Logistics

We are based in-person in San Francisco. We believe the best ideas and work come from being together.

Benefits

• Competitive San Francisco salary and meaningful early equity.

• We sponsor visas. We are committed to working through the process together for the right candidates. If you're currently outside the US, we're also committed to helping you relocate to the US throughout this process.

• We offer generous health, dental, and vision coverage, and relocation support as needed.

Job Overview

Posted Date: Mar 16, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: San Francisco Bay Area

Annual Salary: 138 - 187 USD

Category: Networking

Company: reactor
