Join dnl.ai as a Senior LLM Engineer to design, ship, and run production LLM systems for AI-powered financial analysis. You'll work on tasks such as delivering reliable LLM systems, elevating quality and evaluation, and defining the future of LLM infrastructure. The ideal candidate will have experience with LLM systems, RAG pipelines, and LLMOps at scale.
Job Description
Join us at DNL and help shape the backend of a truly developer-focused platform. We’re building Germany’s leading AI-powered tools that analyse financial and non-financial reports. Using state-of-the-art machine learning technologies, we extract key figures from annual financial statements, fully automatically and transparently. We are hiring a builder who can design, ship, and run production LLM systems under real constraints.
- Stabilize and optimize our core LLM-based features for performance, reliability, and cost efficiency.
- Take ownership of RAG pipelines (chunking, retrieval, reranking, and attribution), ensuring that every answer is traceable, verifiable, and hallucination-resistant.
- Improve serving infrastructure for low-latency and scalable inference (vLLM/TGI, batching, caching, observability).
- Collaborate with Backend, Product, and QA to ship robust features into production and collect early feedback from real auditors.
- Build a systematic evaluation layer for offline and online quality tracking: golden sets, regression testing, human-in-the-loop red-teaming.
- Introduce clear metrics for groundedness, coverage, and faithfulness — and make them visible through dashboards and reports.
- Design context and prompt management systems with versioning, deterministic testing, and safety fallbacks.
- Collaborate with leadership to define LLMOps best practices: CI/CD for prompts and models, automated deployment pipelines, and clear SLOs on latency, accuracy, and cost.
- Architect the next generation of retrieval and reasoning systems for complex financial and ESG documents.
- Drive the vision for LLM orchestration — structured multi-turn flows, memory, and tool use that scale across product lines.
- Mentor other engineers and data scientists in applied LLM engineering and evaluation methodology.
- Contribute to open standards and tooling that make enterprise AI explainable and auditable.
- Work closely with company leadership to align long-term AI strategy with product and market goals.
- Shipped LLM systems to production — with real users, uptime, and feedback loops.
- Deep RAG experience — vector stores, hybrid lexical + dense retrieval, reranking, and source attribution.
- LLMOps at scale — Kubernetes, GPUs, vLLM or TGI, batching & caching, CI/CD for models and prompts, with metrics and tracing you actually look at.
- Evaluation mindset — dataset design, golden queries, offline & online metrics, and human-in-the-loop QA where it truly matters.
- Orchestration mastery — multi-turn flows, memory, tool use, and the judgment to go custom when frameworks get in the way.
- Strong engineering fundamentals — Python, FastAPI, clean APIs, large text pipelines, Postgres, Redis, vector DBs.
- Clear communication in English; German is a plus.
- Finance / audit exposure — annual reports, notes, XBRL, ESRS.
- Retrieval depth — Vespa or Elastic kNN, ColBERT or SPLADE, BM25 + dense hybrid retrieval, reranking at scale.
- Performance optimization — quantization, tensor parallelism, Triton kernels, flash attention, Ray Serve.
- Tooling familiarity — MLflow or W&B, Kafka, pgvector, Milvus, Weaviate, Qdrant.
- Product over paperwork — We ship fast, test in production, and learn by doing.
- Pilots, not passengers — Everyone codes, reviews, and deploys.
- Small, senior, autonomous team — You’ll have real scope, accountability, and impact.
- Infrastructure — Kubernetes, GPUs, Postgres, Redis, object storage, Grafana + Prometheus, GitHub Actions.
- Model & Serving — PyTorch, Hugging Face, vLLM / TGI, SKLearn, FastText.
- Application Layer — Python, FastAPI, vector DBs, Phoenix.
- Ops & Monitoring — MLflow / W&B, full tracing and dashboards.
- Model policy — We use open weights or APIs based on reliability, cost, and data sensitivity.
- Above-Average Compensation: We offer a salary above the market average, reflecting the impact and expertise we value, plus meaningful equity.
- Monthly Perks (Germany-based): If you're employed in Germany, you’ll receive a €50 monthly voucher usable at over 50 popular stores—covering everything from groceries to lifestyle.
- Learning budget: To foster your professional development, we provide financial support for conferences and continuing education courses.
- Innovative Work Culture: A collaborative startup environment with flat hierarchies, fast decisions, and space for your ideas.
- Great People: Work alongside an international team of passionate and driven professionals.
- Time Off: 30 vacation days per year to recharge and explore.
- Remote Flexibility: Work from anywhere within Europe and participate in optional Berlin meetups.
- Flexible Hours: Adapt your schedule to your personal rhythm and lifestyle.
- Top Equipment: We'll provide you with the latest hardware to do your best work.
We are looking forward to your application and getting to know you!
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.