Design and implement LLM services and pipelines on GCP using Python. Develop robust MLOps and vector search pipelines, integrate and optimize transformer models, and collaborate with Data Scientists and Product teams.
Job Description
A remote opportunity in the Enterprise AI & Cloud Platform engineering sector focused on building production-grade LLM-powered services and ML pipelines on Google Cloud. This role drives GenAI features, including vector search, prompt engineering, inference optimization, and MLOps, for B2B and consumer-facing applications. Location: Remote (India). Primary title: Senior AI/ML Engineer - LLM & GCP (Python).
Role & Responsibilities
- Design, implement and productionize Python-based LLM services and microservices on GCP (Vertex AI) to support real-time and batch inference workloads.
- Integrate and optimize transformer models (Hugging Face) and LangChain-based workflows for retrieval-augmented generation and agentic pipelines.
- Build and maintain vector search pipelines (FAISS or managed alternatives), embedding stores, and low-latency query paths for semantic search.
- Develop robust MLOps: containerization, CI/CD, infra-as-code, autoscaling, monitoring and rollback procedures for model deployments on Kubernetes.
- Collaborate with Data Scientists and Product teams to define model evaluation, A/B testing, inference cost/perf trade-offs, and prompt engineering best practices.
- Implement secure, observable APIs and backend services (FastAPI/REST) with logging, tracing and SLIs to ensure production reliability.
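To illustrate the low-latency semantic-search query path mentioned above, here is a minimal sketch of cosine-similarity retrieval over an embedding store (the embedding dimension and toy vectors are illustrative; in production, FAISS or a managed vector index would replace this brute-force search):

```python
import numpy as np

def cosine_search(index_vectors: np.ndarray, query: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k most similar stored vectors by cosine similarity."""
    # Normalize rows and query so a dot product equals cosine similarity.
    index_norm = index_vectors / np.linalg.norm(index_vectors, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = index_norm @ query_norm
    # Sort by descending similarity and keep the top k.
    return np.argsort(-scores)[:k].tolist()

# Toy "embedding store" of 4 vectors (dimension 3 for readability).
corpus = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([1.0, 0.05, 0.0])
print(cosine_search(corpus, query, k=2))  # → [0, 1]
```

The same normalize-then-dot-product pattern is what an inner-product FAISS index computes, just with approximate-nearest-neighbor acceleration at scale.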
Technical Skills Required
- Python
- Hugging Face Transformers
- LangChain
- Google Cloud Platform
- Vertex AI
- Docker
- Kubernetes
- REST APIs
Nice to Have
- FAISS
- PyTorch
- Apache Airflow
- Approximately 5 years of hands-on experience building and shipping production ML/LLM services, with demonstrable projects or deployed systems.
- Strong practical knowledge of model inference optimization, prompt engineering, cost/performance tuning, and production monitoring.
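On the prompt-engineering side of the requirements above, a retrieval-augmented prompt is typically assembled from ranked context chunks plus the user question. A minimal sketch (the template wording and example strings are illustrative, not a prescribed format):

```python
def build_rag_prompt(question: str, chunks: list[str], max_chunks: int = 3) -> str:
    """Assemble a retrieval-augmented prompt from ranked context chunks."""
    # Number each chunk so the model can cite its sources as [n].
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks[:max_chunks]))
    return (
        "Answer the question using only the numbered context below. "
        "Cite sources as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What region is the service deployed in?",
    ["The service runs in us-central1.", "Deploys use Cloud Build."],
)
print(prompt)
```

Capping `max_chunks` is one simple cost/performance lever: fewer retrieved chunks means fewer input tokens per inference call.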
Benefits & Perks
- Fully remote role (India) with flexible hours and a fast-paced, learning-focused engineering culture.
- Opportunities to work end-to-end on GenAI product features and influence MLOps best practices across teams.
- Professional development budget and access to cutting-edge ML tooling and cloud credits.