EPMA seeks an AI DevOps & Infrastructure Engineer to build and maintain hybrid AI environments (AWS + on-prem DGX). This role involves automating infrastructure, managing Kubernetes, and implementing CI/CD pipelines. The focus is on supporting LLM and RAG agent deployments within a collaborative, innovation-driven team.
Job Description
EPMA is looking for an AI DevOps & Infrastructure Engineer for an in-house project.
Title: AI DevOps & Infrastructure Engineer
Location: 100% Remote
Responsibilities:
- Build and maintain hybrid AI environments (AWS + on-prem DGX)
- Automate infrastructure provisioning with Terraform, Helm, or CloudFormation
- Manage Kubernetes clusters, namespaces, and workload isolation
- Implement CI/CD pipelines (GitHub Actions, GitLab CI/CD, Argo, etc.)
- Monitor system performance with Prometheus, Grafana, ELK
- Secure systems with RBAC, IAM, TLS, and Vault
- Support deployment of LLMs, RAG agents, and model pipelines
Must-Have Skills:
- 3+ years in DevOps, preferably supporting ML or AI workloads
- Strong experience with Docker, Kubernetes, Terraform
- Hands-on with AWS (EC2, S3, IAM, KMS, VPCs)
- Experience supporting ML pipelines with Airflow, MLflow, or Kubeflow
- Proficiency in Linux server administration
- Understanding of networking, DNS, and VPNs
- Ability to work with AI/ML engineers collaboratively
Nice-to-Haves:
- Familiarity with LLM deployment tools (vLLM, DeepSpeed, TGI, etc.)
- Experience with vector databases (Pinecone, Weaviate, FAISS)
- Security certifications or experience with GDPR, HIPAA, or SOC 2 compliance
- Past experience supporting ethical AI projects or autonomous systems
Why Join EPMA?
As a people-first company with a sharp focus on innovation and AI, EPMA empowers team members to grow professionally while making a real impact. Join a team where your organizational superpowers are valued—and where your role plays a key part in driving operational excellence.