Join Katonic AI to build LLM serving infrastructure and fine-tuning pipelines. You'll learn to work on cutting-edge systems that deploy 250+ AI models, and contribute to production systems that serve inference requests and run fine-tuning jobs.
About the job
- Position: Product Engineer (AI Infrastructure)
- Location: Remote (India)
- Experience: 0–2 Years
- Type: Full-time
"We deployed a 70B parameter LLM for a government serving 115 million people. It runs entirely on their infrastructure. Zero data leaves their borders. That's not a demo - that's production."
Want to learn how to build systems like this? Keep reading.
About Katonic
We are a Sovereign Enterprise AI Company. Founded in Sydney in 2020, we've grown into a profitable global operation powering AI infrastructure for enterprises and governments across 11 countries. Our platform runs entirely within customer infrastructure - zero data egress, zero vendor lock-in.
Our platform ships 250+ AI models and 80+ pre-built agents, and is ISO 27001 certified. Enterprises using it report up to an 80% increase in workflow efficiency.
Role Overview
We're hiring 2 Product Engineers for our AI Infrastructure team (internally called Adaptive Engine). This is an entry-level role where you'll learn to work on systems that deploy, serve, and fine-tune LLMs at enterprise scale. You'll start by learning, then quickly contribute to production systems that serve inference requests and run fine-tuning jobs for banks, governments, and Fortune 500 companies. If working on cutting-edge LLM infrastructure excites you, we should talk.
What You'll Work On
The Adaptive Engine is our LLM infrastructure for serving and fine-tuning. Here's what's under the hood:
- vLLM & SGLang: High-performance inference engines for LLMs (see the serving sketch after this list)
- NVIDIA NIM: Enterprise-grade model deployment
- Model Zoo: 250+ models - LLaMA, Mistral, DeepSeek, CodeLLaMA, and more
- Fine-tuning Pipeline: LoRA, QLoRA, full fine-tuning on customer data
- GPU Orchestration: Multi-tenant GPU allocation across Kubernetes clusters
- Auto-scaling: Handle traffic spikes without manual intervention
- Guardrails: Safety, compliance, and quality enforcement at inference time
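To give a concrete flavor of the serving side, here's a minimal sketch of offline inference with vLLM; the model name and sampling settings are illustrative placeholders, not our production configuration:

```python
from vllm import LLM, SamplingParams

# Illustrative model choice; any supported Hugging Face causal LM
# loads the same way.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# Placeholder sampling settings, not a recommended default.
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

# vLLM schedules prompts together (continuous batching) to keep the
# GPU busy; outputs come back in the same order as the prompts.
outputs = llm.generate(["Explain LoRA in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```

In production the same engine runs behind an OpenAI-compatible HTTP server, which is what the benchmark sketch further down exercises.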
Your Responsibilities
- Learn to deploy and test LLM serving infrastructure (vLLM, SGLang, NIM)
- Test fine-tuning pipelines - LoRA, QLoRA, and full fine-tuning workflows (see the LoRA sketch after this list)
- Run benchmarks - measure latency, throughput, memory usage, and fine-tuning time (see the benchmark sketch after this list)
- Validate new models before production (LLaMA, Mistral, DeepSeek)
- Test GPU allocation and auto-scaling under real workloads
- Validate fine-tuned model quality against base models
- Identify and report inference failures, OOM errors, and performance issues
- Document deployment procedures and test results
- Learn from senior engineers while contributing from day one
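On the fine-tuning side, attaching LoRA adapters to a base model takes only a few lines with Hugging Face's peft library. This is a minimal sketch, not our pipeline: the base model, rank, and target modules below are placeholder choices.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative base model; the same pattern applies to other
# architectures in the model zoo.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# LoRA trains small low-rank adapter matrices instead of the full
# weights; r and target_modules here are example values.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# Typically well under 1% of parameters end up trainable, which is
# why LoRA and QLoRA fit in far less GPU memory than full fine-tuning.
model.print_trainable_parameters()
```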
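And as a taste of the benchmarking work, here's a rough sketch that measures request latency and token throughput against a locally running OpenAI-compatible vLLM server. The endpoint, model name, and request count are assumptions for illustration; real benchmarks sweep concurrency and input/output lengths.

```python
import time
import requests  # assumes the `requests` package is installed

# Assumed local endpoint: `vllm serve <model>` exposes an
# OpenAI-compatible completions API on port 8000 by default.
URL = "http://localhost:8000/v1/completions"
PAYLOAD = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # illustrative
    "prompt": "Summarize sovereign AI in one sentence.",
    "max_tokens": 64,
}

latencies, total_tokens = [], 0
start = time.perf_counter()
for _ in range(20):  # tiny sequential run, just for illustration
    t0 = time.perf_counter()
    resp = requests.post(URL, json=PAYLOAD, timeout=60)
    resp.raise_for_status()
    latencies.append(time.perf_counter() - t0)
    total_tokens += resp.json()["usage"]["completion_tokens"]
elapsed = time.perf_counter() - start

latencies.sort()
print(f"median latency: {latencies[len(latencies) // 2]:.3f}s")
print(f"max latency:    {latencies[-1]:.3f}s")
print(f"throughput:     {total_tokens / elapsed:.1f} tokens/s")
```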
Who You Are
Must Have:
• 0-2 years experience (fresh graduates with strong fundamentals welcome)
• Solid Python skills - you can write clean, working code
• Understanding of ML basics - what a model is, training vs. inference, fine-tuning
• Familiarity with deep learning concepts (transformers, neural networks)
• Basic knowledge of Linux command line
• Exposure to Docker (even just tutorials or coursework)
• Curiosity about LLMs - you've played with ChatGPT, Claude, or open-source models
• Debugging mindset - you don't give up until you understand why something broke
Nice to Have:
• Coursework or projects in ML/deep learning
• Exposure to Hugging Face, PyTorch, or TensorFlow
• Understanding of fine-tuning concepts (LoRA, transfer learning)
• Basic understanding of APIs (REST)
• Personal projects deploying or fine-tuning ML models
• Familiarity with cloud platforms (AWS/GCP)
What You'll Become
In 12 months, you'll have skills most ML engineers don't:
• LLM serving expertise - vLLM, SGLang, NVIDIA NIM (rare and in-demand)
• Fine-tuning at scale - LoRA, QLoRA, full fine-tuning on enterprise data
• Production ML infrastructure at enterprise scale
• Hands-on experience with the latest models the day they're released (LLaMA 4, Mistral, DeepSeek)
• GPU optimization and Kubernetes orchestration
• Understanding of sovereign AI and compliance requirements
This is the launchpad for ML engineering, MLOps, or platform engineering roles at top AI companies.
Soft Skills
- Problem-solving mindset
- Strong communication skills
- Ownership and accountability
- Ability to learn fast and adapt to new technologies
What we offer
- Opportunity to work at the forefront of Generative AI and Agentic AI
- Fully remote - work from anywhere in India
- Health insurance
- Access to GPUs for learning and experimentation
- Mentorship from experienced ML engineers
- Learning budget for courses and certifications
- Global exposure - collaborate with teams in Sydney, Singapore, Dubai
Please apply only if you match the criteria.
To apply, please fill out this form: https://shorturl.at/3pQYv
Your application is not complete without the form.
Katonic AI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all.