Senior AI DevOps & Infrastructure Engineer (Hybrid Cloud)

EPMA • India
Remote
AI Summary

EPMA seeks an AI DevOps & Infrastructure Engineer to build and maintain hybrid AI environments (AWS + on-prem DGX). The role involves automating infrastructure, managing Kubernetes, and implementing CI/CD pipelines, with a focus on supporting LLM and RAG agent deployments within a collaborative, innovation-driven team.

Key Highlights
Build and maintain hybrid AI environments (AWS + on-prem DGX).
Automate infrastructure provisioning and manage Kubernetes clusters.
Implement CI/CD pipelines and monitor system performance.
Support deployment of LLMs, RAG agents, and model pipelines.
Collaborate with AI/ML engineers.
Technical Skills Required
Docker, Kubernetes, Terraform, AWS (EC2, S3, IAM, KMS, VPCs), Helm, CloudFormation, GitHub Actions, GitLab CI/CD, Argo, Prometheus, Grafana, ELK Stack, RBAC, IAM, TLS, Vault, Airflow, MLflow, Kubeflow, Linux, DNS, VPN, vLLM, DeepSpeed, TGI, Pinecone, Weaviate, FAISS
Benefits & Perks
100% Remote
Professional growth opportunities
Impactful work

Job Description


EPMA is looking for an AI DevOps & Infrastructure Engineer for an in-house project.


Title: AI DevOps & Infrastructure Engineer

Location: 100% Remote



Responsibilities:

  • Build and maintain hybrid AI environments (AWS + on-prem DGX)
  • Automate infrastructure provisioning with Terraform, Helm, or CloudFormation
  • Manage Kubernetes clusters, namespaces, and workload isolation
  • Implement CI/CD pipelines (GitHub Actions, GitLab CI/CD, Argo, etc.)
  • Monitor system performance with Prometheus, Grafana, ELK
  • Secure systems with RBAC, IAM, TLS, and Vault
  • Support deployment of LLMs, RAG agents, and model pipelines


Must-Have Skills:

  • 3+ years in DevOps, preferably supporting ML or AI workloads
  • Strong experience with Docker, Kubernetes, Terraform
  • Hands-on with AWS (EC2, S3, IAM, KMS, VPCs)
  • Experience supporting ML pipelines with Airflow, MLflow, or Kubeflow
  • Proficiency in Linux server administration
  • Understanding of networking, DNS, and VPNs
  • Ability to work with AI/ML engineers collaboratively


Nice-to-Haves:

  • Familiarity with LLM deployment tools (vLLM, DeepSpeed, TGI, etc.)
  • Experience with vector databases (Pinecone, Weaviate, FAISS)
  • Security certifications or experience with GDPR, HIPAA, or SOC 2 compliance
  • Past experience supporting ethical AI projects or autonomous systems


Why Join EPMA?

As a people-first company with a sharp focus on innovation and AI, EPMA empowers team members to grow professionally while making a real impact. Join a team where your organizational superpowers are valued and where your role plays a key part in driving operational excellence.

