AI Operations Engineer

Egypt
Relocation
AI Summary

Manage a GPU-accelerated LLM inference platform: provision and maintain GPU servers and deploy LLM inference engines. Requires strong NVIDIA GPU stack knowledge, 4+ years of Linux systems engineering experience, and 2+ years with GPU or ML/AI infrastructure.

Key Highlights
Manage a GPU-accelerated LLM inference platform
Provision and maintain GPU servers
Deploy LLM inference engines
Key Responsibilities
Provision and maintain GPU servers end-to-end
Deploy and operate LLM inference engines with multi-GPU sharding and quantization strategies
Manage an API gateway for load balancing, model routing, and per-application usage tracking
Own observability: GPU telemetry, latency metrics (p50/p95/p99), cost attribution, and alerting
Handle offline/air-gapped deployments with no internet dependency on production nodes
Benchmark new models, plan fleet capacity, and advise dev teams on prompt and parameter tuning
Support fine-tuning workflows (LoRA/QLoRA) and deploy fine-tuned models to production
Technical Skills Required
Linux systems engineering, NVIDIA GPU stack, Ansible, shell scripting, Python operational tooling, containerisation, service management, database backends
Experience Required
4+ years of experience in Linux systems engineering
2+ years with GPU or ML/AI infrastructure
Strong NVIDIA GPU stack knowledge
Nice to Have
Arabic NLP or multilingual model evaluation experience
Familiarity with MoE architectures or LLM API gateway/proxy solutions
Prior air-gapped or data-sovereign deployment experience

Job Description


Description:

Our client in KSA is seeking an AI Operations Engineer to manage a GPU-accelerated LLM inference platform. You'll own the full stack, from bare-metal provisioning to production model deployment, monitoring, and performance optimization. Open to candidates willing to relocate.


Role:

  • Provision and maintain GPU servers end-to-end: OS hardening, NVIDIA drivers, inference engine deployment
  • Deploy and operate LLM inference engines with multi-GPU sharding and quantization strategies
  • Manage an API gateway for load balancing, model routing, and per-application usage tracking
  • Own observability: GPU telemetry, latency metrics (p50/p95/p99), cost attribution, and alerting (see the telemetry sketch after this list)
  • Handle offline/air-gapped deployments with no internet dependency on production nodes
  • Benchmark new models, plan fleet capacity, and advise dev teams on prompt and parameter tuning
  • Support fine-tuning workflows (LoRA/QLoRA) and deploy fine-tuned models to production
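
To give a feel for the observability and Python operational tooling this role involves, here is a minimal sketch that polls per-GPU telemetry and summarizes request latencies as p50/p95/p99. It assumes the NVML Python bindings (the pynvml / nvidia-ml-py package) are installed on a node with NVIDIA drivers; the latency samples are hypothetical, and this is an illustration rather than the client's actual tooling.

  # Minimal sketch: poll per-GPU telemetry via NVML and summarize request
  # latencies. Assumes the pynvml package and an NVIDIA driver are present;
  # the latency samples below are illustrative placeholders.
  import statistics
  import pynvml

  def collect_gpu_telemetry():
      """Return per-GPU utilization, memory, and power readings."""
      pynvml.nvmlInit()
      readings = []
      try:
          for i in range(pynvml.nvmlDeviceGetCount()):
              handle = pynvml.nvmlDeviceGetHandleByIndex(i)
              util = pynvml.nvmlDeviceGetUtilizationRates(handle)
              mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
              readings.append({
                  "gpu": i,
                  "util_pct": util.gpu,
                  "mem_used_gib": mem.used / 2**30,
                  "mem_total_gib": mem.total / 2**30,
                  "power_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000,
              })
      finally:
          pynvml.nvmlShutdown()
      return readings

  def latency_percentiles(samples_ms):
      """Compute the p50/p95/p99 latency metrics mentioned in the role."""
      qs = statistics.quantiles(samples_ms, n=100)
      return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}

  if __name__ == "__main__":
      for r in collect_gpu_telemetry():
          print(r)
      # Hypothetical latency samples (ms) for one model endpoint.
      print(latency_percentiles([120, 135, 180, 210, 95, 400, 150, 170, 220, 260]))

In practice, readings like these would be exported to a metrics backend and tied to alerting thresholds rather than printed.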


Qualifications

  • 4+ years in Linux systems engineering; 2+ years with GPU or ML/AI infrastructure
  • Hands-on experience deploying LLM inference engines in production
  • Strong NVIDIA GPU stack knowledge: drivers, toolkits, runtime libraries
  • Proficient in Ansible (or similar IaC), shell scripting, and Python operational tooling
  • Solid networking fundamentals: reverse proxy, TLS, HTTP/SSE, load balancing (a brief probe sketch follows this list)
  • Experience with containerisation, service management, and database backends
  • Clear communicator; comfortable working independently in restricted-network environments
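
As a rough illustration of the HTTP/SSE and Python operational tooling items above, the sketch below sends a tiny streamed request to an inference gateway and reports time-to-first-token. The gateway URL, model name, and token are hypothetical placeholders, and the endpoint is assumed to follow the common OpenAI-style SSE chat-completions format; the client's actual gateway may differ.

  # Minimal sketch of an HTTP/SSE health probe against an OpenAI-compatible
  # inference endpoint. URL, model name, and token are placeholders.
  import json
  import time
  import requests

  GATEWAY_URL = "https://gateway.example.internal/v1/chat/completions"  # hypothetical
  HEADERS = {"Authorization": "Bearer <token>", "Content-Type": "application/json"}

  def probe_streaming_latency(model="example-model", timeout=30):
      """Send a small streamed request; report time-to-first-token and total time."""
      payload = {
          "model": model,
          "messages": [{"role": "user", "content": "ping"}],
          "stream": True,
          "max_tokens": 8,
      }
      start = time.monotonic()
      first_token = None
      with requests.post(GATEWAY_URL, headers=HEADERS, json=payload,
                         stream=True, timeout=timeout) as resp:
          resp.raise_for_status()
          for line in resp.iter_lines():
              if not line or not line.startswith(b"data: "):
                  continue
              if line == b"data: [DONE]":
                  break
              if first_token is None:
                  first_token = time.monotonic() - start
              json.loads(line[len(b"data: "):])  # sanity-check the chunk parses
      return {"ttft_s": first_token, "total_s": time.monotonic() - start}

  if __name__ == "__main__":
      print(probe_streaming_latency())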

Nice to have

  • Arabic NLP or multilingual model evaluation experience
  • Familiarity with MoE architectures or LLM API gateway/proxy solutions
  • Prior air-gapped or data-sovereign deployment experience
