Data Engineer

Stealth Startup • United States
Visa Sponsorship Relocation
AI Summary

Design and build scalable data ingestion pipelines, develop data enrichment layers, and implement data quality monitoring and validation. Collaborate with AI/ML and research teams to identify and integrate new data sources. Own the data infrastructure that serves enriched datasets to the simulation pipeline.

Key Responsibilities
Design and build scalable data ingestion pipelines
Develop data enrichment layers
Implement data quality monitoring and validation
Collaborate with AI/ML and research teams
Technical Skills Required
SQL, OLTP and analytical databases
Benefits & Perks
Competitive base salary
Equity participation
Comprehensive medical, vision, and dental coverage
Visa sponsorship
Relocation support

Job Description


We provide organizations with invaluable foresight, empowering them to anticipate outcomes and proactively make the right decisions at the right time, every time.

We're a small, dedicated, mission-driven team and we intend to stay that way. We believe the best work happens when exceptionally talented people are given ownership, trust and the space to operate without bureaucratic friction. We work with urgency and intellectual honesty and expect new team members to match our velocity. We seek individuals who thrive at the frontier, who push beyond conventional limits, who bring curiosity and conviction in equal measure, and who want their work to have demonstrable impact in the world. If you're energized by the idea of a small team doing things that feel impossible, let’s build together.


ABOUT THE ROLE

As a Data Engineer, you'll build and scale the data acquisition and enrichment infrastructure that makes our simulations accurate. The Data team owns the pipelines that ingest, process, and serve the diverse real-world data sources our simulation engine depends on — from public demographic datasets to proprietary consumer behavior signals. You'll work on turning messy, heterogeneous data into clean, structured inputs that power every simulation we run.


RESPONSIBILITIES

  • Design and build scalable data ingestion pipelines for diverse sources: public datasets (census, labor statistics), licensed proprietary data, and web-scraped sources
  • Develop the data enrichment layer that joins location-level behavioral data with demographic profiles, workplace characteristics, and consumer behavior markers
  • Build and maintain systems for processing unique data types — foot-traffic patterns, cross-shopping behavior, and trade area demographics
  • Implement data quality monitoring and validation to ensure incoming data meets accuracy thresholds before it reaches the simulation engine
  • Collaborate with AI/ML and research teams to identify and integrate new data sources that improve simulation fidelity
  • Own the data infrastructure that serves enriched datasets to the simulation pipeline at the speed and reliability production demands

YOU MAY BE A FIT IF

  • You've built production data pipelines that ingest and process data from multiple heterogeneous sources at scale
  • You have experience working with geospatial data, census datasets, or similar public/proprietary data sources
  • You care deeply about data quality and have built systems to detect, flag, and remediate data issues automatically
  • You're comfortable with the full data lifecycle: acquisition, cleaning, transformation, storage, and serving
  • You have strong SQL skills and experience with both OLTP and analytical databases
  • You can work independently to scope, plan, and execute data infrastructure projects

STRONG CANDIDATES MAY ALSO

  • Have experience with geospatial processing (reverse geocoding, census block mapping, trade area analysis)
  • Have built ETL/ELT pipelines for alternative data (foot traffic, mobility, transaction data)
  • Have worked with imputation techniques for handling missing or sparse data
  • Have familiarity with demographic modeling or population statistics

LOCATION

This role is based in New York City. We are an in-person company, and during this period of hypergrowth we work in the office 6 days a week. Candidates are expected to be located within the New York City metropolitan area or be open to relocation.


BENEFITS

We take care of our people. In addition to a competitive base salary and equity participation, we offer comprehensive medical, vision, and dental coverage, visa sponsorship and relocation support, and various other benefits and perks.

