AI Platform Engineer (LLM Infrastructure)

Akvelon, Inc. • Bosnia and Herzegovina

Remote

This job is no longer active. This position is no longer accepting applications.

AI Summary

Akvelon is seeking an AI Platform Engineer to build and operate an internal AI platform for efficient AI-powered service delivery. The role focuses on Kubernetes-based infrastructure, LLM workflows, and developer experience. Requires strong Kubernetes, Terraform, and Python skills, with experience in MLOps/DevOps for AI workloads.

Key Highlights

Build and operate an internal AI platform for LLM-based services.

Focus on improving DevEx and reducing time-to-market for AI features.

Requires strong Kubernetes, Terraform, and Python expertise in an MLOps/DevOps context.

Technical Skills Required

Kubernetes, AKS, Terraform, Python, CI/CD, Azure AI Foundry, LiteLLM, Langfuse, MLflow, Kubeflow Pipelines, Argo Workflows

Job Description

This engagement is focused on building an internal AI platform that enables developers to ship AI-powered services efficiently. Scope includes model connectivity, prompt testing and evaluation, monitoring/observability, and the underlying AI infrastructure layer.

The objective is to improve DevEx and reduce time-to-market for AI features.

Tasks

Build and operate the AI platform infrastructure enabling developers to ship LLM-based services faster.

Implement and maintain Kubernetes-based runtime environments (incl. AKS) for AI workloads.

Manage infrastructure as code with Terraform (modules, environments, CI/CD automation).

Support LLM workflows: RAG, agents, prompt experimentation, evaluations, and deployment patterns.

Integrate and operate tooling such as Azure AI Foundry, LiteLLM, Langfuse, MLflow.

Orchestrate pipelines using Kubeflow Pipelines and/or Argo Workflows (build, deploy, evaluate).

Improve platform reliability and observability (monitoring, logging, tracing, cost/perf signals).

Collaborate closely with developers to streamline DX (APIs, templates, docs, golden paths, automation).

Requirements

Strong hands-on experience with Kubernetes in production (preferably AKS).

Solid Terraform expertise (IaC best practices, multi-env setups).

Practical experience supporting ML/LLM workloads in a platform or DevOps/MLOps context.

Proficiency in Python for automation, scripting, and supporting APIs/evaluation tooling.

Understanding of CI/CD, release processes, and production-grade operations.

Ability to work under tight timelines and deliver pragmatically.

Nice to Have

Experience building internal developer platforms or “paved roads” for engineering teams.

Familiarity with LLM evaluation frameworks, prompt testing workflows, and LLM observability.

Exposure to RAG architectures, vector databases, and agentic patterns.

Experience with Kubeflow, Argo, and ML lifecycle tooling.

Engagement Type

Long-term B2B contract.

Team

You will join a team of 5, with 3 AI Platform Engineers being added.

Location / Timezone

Remote within Europe (preferred: Croatia, Poland, Portugal, Serbia).

European working hours.

Occasional availability for meetings until 10:00 AM PST (for US overlap).

Job Overview

Posted Date: Apr 14, 2026

Employment Type: Contract

Experience Level: Associate

Location: Bosnia and Herzegovina

Category: DevOps

Company: Akvelon, Inc.
