Senior Data Engineer - Healthcare AI

Vamstar, Bulgaria
Remote
AI Summary

Design, build, and maintain data pipelines for healthcare companies. Develop and own ML/LLM pipelines. Work autonomously with minimal hand-holding.

Key Highlights
Design, build, and maintain batch/streaming data pipelines
Develop and own ML/LLM pipelines
Work autonomously with minimal hand-holding
Technical Skills Required
Python, Kafka, Flink, Spark, Airflow, MLflow, Postgres, BigQuery, Snowflake, DynamoDB, OpenSearch, Elastic, RAG architectures, embeddings/vector DBs, prompt engineering, function/tool calling, observability
Benefits & Perks
Remote work
Full-time employment

Job Description


Title: Senior Data Engineer - Healthcare AI

Location: Bulgaria or Europe | Remote (±2–3 hrs GMT overlap mandatory)

Reports to: Head of Engineering

Existing Clients: Top 100 Life Sciences, MedTech, and Pharma companies

Type: Full-time


Core responsibilities & objectives

  • Design, build, and maintain batch and streaming data pipelines covering ingestion, cleaning, normalisation, enrichment, and deduplication.
  • Build and own ML/LLM pipelines end-to-end: document parsing, chunking, embeddings generation, vector indexing, agentic tool calling, multi-step workflows, retries, fallbacks, and state handling.
  • Write production-grade, well-tested Python that processes large volumes of data and documents reliably.
  • Own pipeline health: if data is stale, broken, or wrong, it's on you.
  • Work autonomously to project deadlines with minimal hand-holding.


Key qualifications & skills (non-negotiable)

  • 7+ years in data-heavy backend development or data engineering.
  • Previous experience working in a startup.
  • Highly proficient in Python.
  • Hands-on experience with large datasets and high-velocity data streams (Kafka, Flink, Spark).
  • Strong with pipeline orchestration tools (Airflow, MLflow, or equivalent).
  • Solid SQL skills (Postgres, BigQuery, or Snowflake) and NoSQL experience (DynamoDB, OpenSearch, Elastic).
  • Real experience with LLM workflows: RAG architectures, embeddings/vector DBs, prompt engineering, function/tool calling, observability.
  • Deep understanding of ETL/ELT patterns and data processing at scale.


Preferred background (strong signals)

  • Experience with AWS data stack at scale.
  • Exposure to healthcare, life sciences, or regulated industries.
  • Built and shipped data, ML and LLM-powered pipelines in production.
  • Has debugged a pipeline and knows why observability matters.
  • Worked in a fast-moving startup where "that's not my job" doesn't exist.


What will get you rejected

  • "I set up the pipeline, someone else monitors it" mindset.
  • Tutorials and side projects but no production experience at scale.
  • Can't explain the trade-offs between streaming and batch, or why you chose one vector DB over another.
  • Needs detailed specs before writing a line of code.
  • No curiosity about healthcare or what the data actually means.


Interested? We're a distributed team solving hard problems that will reshape the healthcare industry for a generation. If you want ownership, not just tickets, we'd like to hear from you.

