Build core ML systems for AI agents: design and train models for information retrieval, entity resolution, classification, and structured data extraction. Requires 3+ years of experience in NLP, information retrieval, or entity resolution, plus strong hands-on experience with Python and PyTorch.
Job Description
Who is Recruiting from Scratch: Recruiting from Scratch is a specialized talent firm dedicated to helping companies build exceptional teams. We partner closely with our clients to deeply understand their needs, then connect them with top-tier candidates who are not only highly skilled but also the right fit for the company’s culture and vision. Our mission is simple: place the best people in the right roles to drive long-term success for both clients and candidates.
https://www.recruitingfromscratch.com/
Title: Founding ML Engineer
Location: San Francisco, CA
Company Stage: Early-Stage (YC-backed, Profitable, High-Growth)
Office Type: Onsite
Salary: $150,000 – $300,000 + Equity (0.10% – 0.50%)
Company Description
This fast-growing, venture-backed startup is building the core infrastructure layer that enables AI agents to access, understand, and act on real-time internet data. Instead of traditional search workflows designed for humans, the platform provides APIs that allow AI systems to retrieve high-fidelity, structured data directly from source systems.
The company has achieved strong early traction—scaling to millions in ARR within its first year—and is already serving enterprise customers. Backed by leading investors including Y Combinator and top-tier venture firms, the team is now focused on pushing the boundaries of applied machine learning to power the next generation of AI-native data systems.
What You Will Do
- Own the end-to-end development of core ML systems—from research and modeling to production deployment
- Design and train models for information retrieval, entity resolution, classification, and structured data extraction
- Build systems that transform messy, multilingual web-scale data into structured, queryable intelligence
- Develop embedding models, ranking systems, and retrieval pipelines for high-precision search and matching
- Apply transformer architectures and modern NLP techniques to real-world data problems
- Leverage LLMs for tasks such as extraction, classification, and data enrichment at scale
- Continuously evaluate and improve model performance using rigorous experimentation and metrics
- Work closely with engineering and product teams to integrate ML systems into production APIs
Ideal Background
- 3+ years of experience building and shipping production ML systems, particularly in NLP, information retrieval, or entity resolution
- Strong hands-on experience with Python and PyTorch
- Deep understanding of transformer architectures, including training and fine-tuning encoder models
- Experience building retrieval systems, classifiers, or embedding-based systems
- Familiarity with representation learning techniques (e.g., contrastive learning, metric learning)
- Experience applying LLMs to structured data problems (e.g., extraction, classification, generation)
- Strong problem-solving skills with the ability to work on ambiguous, large-scale data challenges
- High ownership mindset with a strong bias toward execution in fast-paced environments
Preferred
- Experience with entity resolution or record linkage at scale
- Background in multilingual or cross-lingual NLP
- Experience building taxonomies, ontologies, or knowledge systems
- Familiarity with distributed training on GPU clusters
- Experience scaling LLM inference pipelines in production
- Research publications or open-source contributions in NLP/IR
Compensation and Benefits
- Base salary: $150K – $300K
- Equity: 0.10% – 0.50% (founding-level ownership)
- Visa sponsorship available
- Opportunity to join at an early stage with strong product-market fit and rapid growth
- High ownership role with direct impact on core product and company trajectory
- Work alongside experienced founders and top-tier investors
This role is ideal for ML engineers who want to operate at the frontier of applied NLP and retrieval—owning core intelligence systems that power how AI agents interact with real-world data at scale.