Senior LLM Infrastructure Engineer

BeGig India
Remote

Job Description


About BeGig

BeGig is the leading tech freelancing marketplace. We empower innovative, early-stage, non-tech founders to bring their visions to life by connecting them with top-tier freelance talent. By joining BeGig, you’re not just taking on one role—you’re signing up for a platform that will continuously match you with high-impact opportunities tailored to your expertise.


Your Opportunity

Join BeGig as an LLM Infrastructure Engineer and power the next wave of AI-driven applications. You’ll architect and scale infrastructure for large language models (LLMs), supporting inference, fine-tuning, and seamless integration in production systems.


Role Overview

As an LLM Infrastructure Engineer, you will:

  • Design, deploy, and optimize cloud infrastructure for training and serving LLMs at scale.
  • Manage GPU/TPU clusters and containerized ML workloads (Kubernetes, Docker, Ray, etc.).
  • Implement monitoring, logging, and auto-scaling for high-availability AI systems.
  • Collaborate with AI engineers, data scientists, and product teams to productionize LLM workflows.
  • Ensure infrastructure security, compliance, and cost-effectiveness.


Technical Requirements & Skills

  • Experience: 3+ years in ML infrastructure, MLOps, or cloud engineering.
  • Platforms: Deep experience with AWS, GCP, or Azure for ML workloads.
  • Orchestration: Familiarity with Kubernetes, Docker, MLflow, Ray, or Airflow.
  • LLMs: Understanding of LLM architectures (e.g., GPT, Gemini, Llama), deployment, and scaling challenges.
  • Scripting: Strong Python or Bash skills for automation and pipeline management.


What We’re Looking For

  • An engineer passionate about building the backbone of modern AI systems.
  • Proactive in optimizing performance, reliability, and cost at every layer.
  • A strong collaborator who communicates clearly across AI, infra, and product teams.


Why Join Us?

  • Impact: Build the infrastructure powering next-gen AI products and services.
  • Flexibility: Choose projects that match your expertise—work fully remote.
  • Community: Be a key part of a fast-growing AI and engineering talent marketplace.


