Senior LLM Engineer - Full Stack (Python, GCP)

Remote


Job Description

A remote opportunity in the Enterprise AI & Cloud Platform engineering sector focused on building production-grade LLM-powered services and ML pipelines on Google Cloud. This role drives GenAI features — vector search, prompt engineering, inference optimization, and MLOps — for B2B and consumer-facing applications. Location: Remote (India). Primary title: Senior AI/ML Engineer - LLM & GCP (Python).

Role & Responsibilities

  • Design, implement, and productionize Python-based LLM services and microservices on GCP (Vertex AI) to support real-time and batch inference workloads.
  • Integrate and optimize transformer models (Hugging Face) and LangChain-based workflows for retrieval-augmented generation and agentic pipelines.
  • Build and maintain vector search pipelines (FAISS or managed alternatives), embedding stores, and low-latency query paths for semantic search.
  • Develop robust MLOps: containerization, CI/CD, infra-as-code, autoscaling, monitoring and rollback procedures for model deployments on Kubernetes.
  • Collaborate with Data Scientists and Product teams to define model evaluation, A/B testing, inference cost/perf trade-offs, and prompt engineering best practices.
  • Implement secure, observable APIs and backend services (FastAPI/REST) with logging, tracing and SLIs to ensure production reliability.

Skills & Qualifications

Must-Have

  • Python
  • Hugging Face Transformers
  • LangChain
  • Google Cloud Platform
  • Vertex AI
  • Docker
  • Kubernetes
  • REST APIs

Preferred

  • FAISS
  • PyTorch
  • Apache Airflow

Qualifications

  • Approximately 5 years of hands-on experience building and shipping production ML/LLM services, with demonstrable projects or deployed systems.
  • Strong practical knowledge of model inference optimization, prompt engineering, cost/performance tuning, and production monitoring.

Benefits & Culture Highlights

  • Fully remote role (India) with flexible hours and a fast-paced, learning-focused engineering culture.
  • Opportunities to work end-to-end on GenAI product features and influence MLOps best practices across teams.
  • Professional development budget and access to cutting-edge ML tooling and cloud credits.

Skills: GCP, LLM, Python
