Design, build, and maintain batch/streaming data pipelines. Write production-grade Python code. Own pipeline health and work autonomously.
Job Description
Title: Senior Data Engineer - Healthcare AI
Location: Bulgaria or Europe | Remote (±2–3 hrs GMT overlap mandatory)
Reports to: Head of Engineering
Existing Clients: Top 100 Life Sciences, MedTech, and Pharma companies
Type: Full-time
Core responsibilities & objectives
- Design, build, and maintain batch and streaming data pipelines covering ingestion, cleaning, normalisation, enrichment, and deduplication.
- Build and own ML/LLM pipelines end-to-end: document parsing, chunking, embedding generation, vector indexing, agentic tool calling, multi-step workflows, retries, fallbacks, and state handling.
- Write production-grade, well-tested Python that processes large volumes of data and documents reliably.
- Own pipeline health: if data is stale, broken, or wrong, it's on you.
- Work autonomously to project deadlines with minimal hand-holding.
Key qualifications & skills (non-negotiable)
- 7+ years in backend data-heavy development or data engineering.
- Highly proficient in Python.
- Hands-on experience with large datasets and high-velocity data streams (Kafka, Flink, Spark).
- Strong with pipeline orchestration tools (Airflow, MLflow, or equivalent).
- Solid SQL skills (Postgres, BigQuery, or Snowflake) and NoSQL experience (DynamoDB, OpenSearch, Elastic).
- Real experience with LLM workflows: RAG architectures, embeddings/vector DBs, prompt engineering, function/tool calling, observability.
- Deep understanding of ETL/ELT patterns and data processing at scale.
Preferred background (strong signals)
- Experience with AWS data stack at scale.
- Exposure to healthcare, life sciences, or regulated industries.
- Built and shipped data, ML, and LLM-powered pipelines in production.
- Has debugged a pipeline and knows why observability matters.
- Worked in a fast-moving startup where "that's not my job" doesn't exist.
What will get you rejected
- "I set up the pipeline, someone else monitors it" mindset.
- Tutorials and side projects but no production experience at scale.
- Can't explain trade-offs between streaming vs. batch, or why you chose one vector DB over another.
- Needs detailed specs before writing a line of code.
- No curiosity about healthcare or what the data actually means.
Interested? We're a distributed team solving hard problems that will reshape the healthcare industry for a generation. If you want ownership, not just tickets, we'd like to hear from you.