DevOps Engineer

Akkadian Labs • United States
Remote
AI Summary

Akkadian Labs is seeking a DevOps Engineer to support the design, implementation, and maintenance of scalable and secure infrastructure and DevOps processes. The ideal candidate will have 5+ years of experience in DevOps, Site Reliability Engineering, or a related role. Key responsibilities include infrastructure and environment management, AI and agent infrastructure implementation, and observability and reliability.

Key Highlights
Support design, implementation, and maintenance of scalable and secure infrastructure and DevOps processes
Work with development, QA, and product teams to enable reliable deployments and automate workflows
Implement AI and agent infrastructure and manage AI model deployment pipelines
Technical Skills Required
AWS, Linux, Docker, Kubernetes, Terraform, CloudFormation, Python, Bash, Jenkins, BitBucket CI/CD, Prometheus, Grafana, ELK

Job Description


Who We Are

Akkadian Labs is a Collaboration Lifecycle Automation Platform that serves some of the largest global enterprises and government agencies. Our platform currently reduces manual work by up to 90% and costs by as much as 50%, while improving accuracy and governance across leading Unified Communications platforms.

This ability to innovate at scale defines who we are: big enough to compete at the highest level, yet agile enough to stay ahead of a rapidly evolving market. Our culture is people-first, fully remote, and rooted in respect, innovation, and teamwork, because when our people thrive, so do our customers.

Who You Are

The DevOps Engineer will support the design, implementation, and maintenance of scalable and secure infrastructure and DevOps processes at Akkadian Labs. You will work with development, QA, and product teams to enable reliable deployments, automate workflows, and improve system observability across Rocky OS-based, AWS-hosted, and on-premises solutions.

This is a hands-on technical role focused on execution, continuous improvement, and operational excellence within the DevOps function led by the DevOps Manager.

Key Responsibilities

Infrastructure and Environment Management

  • Support deployment and maintenance of scalable infrastructure in AWS and hybrid cloud environments.
  • Assist in managing infrastructure-as-code (IaC) using Terraform, CloudFormation, or similar tools.
  • Help maintain Linux-based environments.
  • Contribute to containerization efforts using Docker and orchestration via Kubernetes.

AI and Agent Infrastructure Implementation & Support

  • Work on the design, deployment and management of AI agent workloads, including provisioning compute instances and managing resource scaling for inference-heavy tasks.
  • Play a key role in building and maintaining model deployment pipelines, including versioning, testing, and rollback of AI models in production environments.
  • Monitor AI API consumption and infrastructure costs, implementing alerting and controls to prevent runaway usage and support budget visibility.
  • Coordinate the implementation of infrastructure-level security guardrails for AI systems, including access controls and data isolation for model inputs and outputs.

Observability and Reliability

  • Manage monitoring and observability efforts using tools such as Prometheus, Grafana, and the ELK stack.
  • Troubleshoot system issues and contribute to incident response and root cause analysis.
  • Develop and execute strategies for improving system reliability, performance, and uptime.

CI/CD and Automation

  • Build, maintain, and optimize CI/CD pipelines using tools such as Jenkins, BitBucket CI/CD, or similar.
  • Automate routine operational tasks including builds, testing, deployments, and system updates.
  • Work with engineering teams to integrate pipelines with Akkadian tools.

Security and Compliance

  • Follow secure DevOps practices and assist in implementing security controls.
  • Support compliance initiatives and vulnerability remediation efforts.

Collaboration and Documentation

  • Work closely with DevOps, engineering, QA, and product teams to support deployments and releases.
  • Maintain documentation for infrastructure, processes, and operational procedures.
  • Participate in team ceremonies and continuous improvement initiatives.


Requirements

Required Qualifications 

  • Experience: 5+ years of experience in DevOps, Site Reliability Engineering (SRE), or a related role. 
  • Cloud Expertise: Hands-on experience with AWS (e.g., EC2, ECS, S3, IAM, Lambda, CloudWatch). 
  • Linux Knowledge: Working knowledge of Linux environments. 
  • Containerization: Familiarity with Docker and Kubernetes. 
  • Scripting: Basic to intermediate scripting ability in Python, Bash, or similar languages. 
  • CI/CD: Experience building or maintaining CI/CD pipelines and related tools. 
  • Observability: Exposure to monitoring and observability tools such as Prometheus, Grafana, and ELK. 
  • Security: Understanding of secure DevOps practices and basic compliance concepts. 

Preferred Qualifications 

  • Experience supporting AI or machine learning workloads and compute environments.
  • Exposure to AI model deployment pipelines and model versioning practices. 
  • Experience with infrastructure-as-code tools such as Terraform or CloudFormation. 
  • Familiarity with hybrid cloud or on-premises environments. 
  • Exposure to security best practices in DevOps contexts, including AI-specific concerns such as data isolation and access controls. 
  • Experience supporting production systems and participating in on-call rotations. 


Benefits

We offer a fully remote environment, plus a competitive benefits package including medical, dental, vision, company-paid life insurance and disability policies, 401(k) with a generous matching program, and paid time off.

