Senior AI Engineer for Knowledge Graph Development

Remote Relocation
AI Summary

We're hiring a Senior AI Engineer to develop a knowledge graph-backed RAG pipeline. The ideal candidate will have 5+ years of experience shipping data/AI systems to production, strong Python and SQL skills, and experience with graph databases. The role covers designing and implementing a deterministic rule engine, data validation and quality checks, and live data operations.

Key Highlights
Design and implement a knowledge graph-backed RAG pipeline
Develop a deterministic rule engine
Implement data validation and quality checks
Key Responsibilities
Own the end-to-end pipeline that turns unstructured documents into a validated, queryable knowledge graph
Accountable for extraction quality, graph integrity, and the data layer that backs the product's read path
Develop a deterministic rule engine
Technical Skills Required
Python, SQL, PostgreSQL, Neo4j, Pydantic, LLM pipelines, durable workflow orchestration
Benefits & Perks
Competitive salary: 32.000–42.000 € base
Remote, full-time with flexible scheduling
Possibility of relocation once a successful working relationship has been established over a period of time
Nice to Have
Experience with regulated, compliance-driven, or standards-heavy extraction domains
Experience designing deterministic evaluators alongside LLM components, and knowing when to reach for which
Contributions to data contracts, schema governance, or ontology work

Job Description


Pinnipedia is a new Berlin startup building a cloud platform that automates and assists the creation of audit-ready IT-security concepts (e.g., BSI-Grundschutz, C5). We’re IGP-funded (2025/26) and co-develop with FU Berlin and pilot users from industry and security consulting.


We’re hiring an AI Engineer to turn messy inputs into structured knowledge and reliable answers.


Your Mission: Own the end-to-end pipeline that turns unstructured documents into a validated, queryable knowledge graph. You are accountable for extraction quality, graph integrity, and the data layer that backs the product's read path.


Tasks

LLM extraction pipelines: document chunking, property and relationship extraction, cross-chunk reconciliation, and gap detection, built with structured-output LLM agents orchestrated by durable workflows.


Knowledge graph: schema design as typed Pydantic models, Cypher access patterns and indexing strategy, graph operations, schema evolution and migration. Scope ends at the graph boundary; API contracts and query abstractions exposed to consumers belong to the full-stack engineer.


Deterministic rule engines: table-driven evaluators for cases where code beats LLM judgment; clear contracts between deterministic and probabilistic components.


Data validation & quality: schema enforcement, required-property contracts, audit trails, eval harnesses (expert review, unsupervised checks, synthetic fixtures, LLM-as-judge).


Live data ops: backfills, coordinated migrations across relational + graph stores, observability on extraction throughput and quality, incident response.


Requirements

Must-have



  • 5+ years shipping data/AI systems to production with real customers: you have been on-call for live pipelines and know what breaks at 2am.

  • Strong Python (typed, modern) and SQL. Comfortable with PostgreSQL under load.

  • Production experience with at least one graph database (Neo4j preferred; Neptune, ArangoDB, TigerGraph acceptable): schema design and query tuning, not toy use.

  • Production LLM pipeline experience: structured output, agent orchestration, prompt and version management, evaluation frameworks. PydanticAI, LangChain, DSPy, or Instructor all welcome.

  • Durable workflow orchestration in production (DBOS, Temporal, Airflow, Prefect, Dagster).

  • Test-first discipline: integration tests against real datastores (Testcontainers or equivalent), not mock-heavy unit tests.

  • Fluent English skills.


Nice-to-have



  • Experience with regulated, compliance-driven, or standards-heavy extraction domains (legal, medical, financial, security/audit).

  • Experience designing deterministic evaluators alongside LLM components, and knowing when to reach for which.

  • Contributions to data contracts, schema governance, or ontology work.

  • German language skills.


Benefits

Remote, full-time with flexible scheduling. CET (Berlin) timezone availability expected.


Possibility of relocation once a successful working relationship has been established over a period of time.


Competitive salary: 32.000–42.000 € base (premium for exceptional senior profiles).


Small, focused team; direct collaboration with the Product Owner and Full-Stack Engineer.


Modern tooling, real ownership, and a learning budget for role-relevant training.


Impact: help SMEs meet rising security requirements with less friction.


Apply on JOIN with your CV (PDF) and a short note (max 200 words) describing how you would design a KG-backed RAG pipeline (ontology scope, indexing, retrieval, and evaluation you’d use).

Process: 20-min intro → 90-min practical (graph modeling + retrieval evaluation) → 45-min team chat → references. We review applications within 5 business days.

