Machine Learning Engineer - Large Language Model (LLM) and Retrieval-Augmented Generation (RAG)

Remote
Key Highlights
Design and implement end-to-end RAG solutions
Optimize LLM inference performance
Deploy scalable infrastructure using Docker and orchestration platforms
Collaborate with Data Scientists and product teams
Integrate and tune vector search stacks
Technical Skills Required
Python, Hugging Face Transformers, PyTorch, FAISS, Milvus, Weaviate, Docker, AWS, GCP, Azure, ONNX, Triton, LangChain, LlamaIndex, MLflow, Prometheus, Grafana
Benefits & Perks
Fully remote role with flexible hours
Opportunity to work on cutting-edge LLM/RAG products
Collaborative, fast-paced engineering culture

Job Description


Machine Learning Engineer – LLM & RAG (Remote, India)

About The Opportunity

We operate in the AI/ML and Enterprise Software sector, building production-ready large language model (LLM) applications and retrieval-augmented generation (RAG) systems that solve real-world enterprise problems. The team focuses on scalable, low-latency LLM inference, vector search, and data pipelines to deliver intelligent search, summarization, and automated knowledge workflows for customers across industries.

Role & Responsibilities

  • Design and implement end-to-end RAG solutions: document ingestion, embedding generation, vector indexing, retriever design, and LLM-based response generation.
  • Develop and maintain Python back-end services and APIs that integrate LLMs, LangChain/LlamaIndex workflows, and vector search for production use.
  • Optimize LLM inference performance: model selection, batching, quantization, ONNX/Triton integration, and memory/GPU optimization to meet latency and cost SLAs.
  • Integrate and tune vector search stacks (FAISS, Milvus, Weaviate, or hosted vector DBs) and design embedding strategies for robust retrieval.
  • Deploy and operate scalable infrastructure using Docker and orchestration platforms; automate CI/CD, monitoring, and alerting for ML services.
  • Collaborate with Data Scientists and product teams to productionize models, implement A/B experiments, monitor drift, and iterate on model quality and UX.
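The RAG pipeline described in the first bullet (ingestion, embedding, vector indexing, retrieval, and prompt assembly for generation) can be sketched end to end. This is a minimal, dependency-free illustration only: the bag-of-words `embed` function and the brute-force `TinyVectorIndex` are toy stand-ins for a real sentence-embedding model and a production index such as FAISS, Milvus, or Weaviate, and all names here are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a learned embedding: lowercase term-frequency counts.
    # A production pipeline would call a sentence-encoder model instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TinyVectorIndex:
    """Brute-force similarity index; FAISS/Milvus would replace this at scale."""

    def __init__(self):
        self.docs = []  # list of (text, embedding) pairs

    def add(self, text: str) -> None:
        # Ingestion + embedding generation + indexing in one step.
        self.docs.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list:
        # Retriever: rank stored chunks by similarity to the query embedding.
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(query: str, index: TinyVectorIndex) -> str:
    # Retrieved chunks become grounding context in the LLM prompt; the prompt
    # would then be sent to the generation model (not shown here).
    context = "\n".join(index.search(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

index = TinyVectorIndex()
index.add("FAISS provides fast approximate nearest neighbor search.")
index.add("Quantization reduces model memory footprint.")
index.add("Docker packages services into portable containers.")

prompt = build_prompt("how does vector search work", index)
print(prompt)
```

The same shape carries over to the real stack: swap `embed` for a transformer encoder, `TinyVectorIndex` for a FAISS or Milvus index, and pass `prompt` to the serving LLM.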

Skills & Qualifications

Must-Have

  • 4+ years of experience in machine learning or ML engineering with hands-on LLM projects.
  • Strong software engineering skills in Python and experience building production back-end services.
  • Experience with transformer frameworks and LLM tooling (Hugging Face Transformers, PyTorch).
  • Practical experience building RAG pipelines and working with vector search (FAISS or similar).
  • Proven experience deploying ML services with Docker in cloud environments (AWS/GCP/Azure).
  • Knowledge of model optimization and serving techniques (quantization, ONNX, Triton, batching).

Preferred

  • Hands-on experience with LangChain, LlamaIndex, or similar orchestration frameworks.
  • Familiarity with vector databases (Milvus, Weaviate) and managed vector DB services.
  • Experience with MLOps and monitoring tools (MLflow, Prometheus, Grafana, model-drift tooling).

Benefits & Culture Highlights

  • Fully remote role with flexible hours supporting work-life balance across India.
  • Opportunity to work on cutting-edge LLM/RAG products and influence architecture and tooling choices.
  • Collaborative, fast-paced engineering culture that values ownership, experimentation, and scalable design.

To apply, bring strong Python engineering, hands-on LLM/RAG experience, and a passion for shipping scalable AI systems. This role is ideal for engineers who enjoy end-to-end ownership of production ML services and optimizing LLMs for real user impact.

Skills: LLM, RAG, Python
