AI Inference Engineer

Jobgether • Eastern Region
Remote
AI Summary

A Jobgether partner company is seeking an AI Inference Engineer to design, optimize, and maintain the inference layer for high-performance AI execution on edge devices. The ideal candidate has a strong foundation in systems programming and machine learning, expertise in C++, and experience with inference frameworks such as llama.cpp, ggml, or ONNX. The role focuses on low-level optimization and architecture for on-device AI and decentralized technologies.

Technical Skills Required
C++, JavaScript, llama.cpp, ggml, ONNX, deep learning concepts, transformers, LLMs, diffusion models

Job Description


This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Inference Engineer QVAC in Saudi Arabia.

This role offers a unique opportunity to work at the cutting edge of on-device AI, building the core systems that power fast, private, and reliable inference on real-world hardware. You will operate close to the metal, designing and optimizing the runtime layer that enables machine learning models to perform efficiently without relying on cloud infrastructure. The position sits at the intersection of systems engineering and AI, where performance, stability, and scalability are critical. You will collaborate with researchers and product teams to bring advanced models into production environments. With a strong focus on low-level optimization and architecture, your work will directly shape the future of decentralized, peer-to-peer AI experiences. This is an ideal role for engineers who enjoy deep technical challenges and ownership of core infrastructure.

Accountabilities

In this role, you will be responsible for designing, optimizing, and maintaining the inference layer that enables high-performance AI execution on edge devices. You will ensure systems are robust, efficient, and scalable across diverse hardware environments.

  • Develop and optimize C++-based inference systems for deploying AI models on edge devices.
  • Enhance and adapt inference engines such as llama.cpp, ggml, and ONNX for improved performance and compatibility.
  • Improve runtime efficiency, focusing on memory usage, latency, throughput, and long-session stability.
  • Collaborate with research teams to transition models from experimentation to production-ready deployments.
  • Define and maintain core abstractions that support scalable and maintainable inference capabilities.
  • Integrate AI-driven features into existing products, ensuring seamless performance and reliability.
  • Continuously evaluate and implement new technologies to improve system capabilities and efficiency.

Requirements

You are a highly skilled engineer with a strong foundation in systems programming and machine learning, capable of working on complex, performance-critical AI infrastructure.

  • Strong programming expertise in C++, with additional experience in JavaScript considered a plus.
  • Proven experience with inference frameworks such as llama.cpp, ggml, ONNX, or similar technologies.
  • Solid understanding of deep learning concepts, including transformers, LLMs, and diffusion models.
  • Experience deploying and optimizing machine learning models on edge devices or constrained environments.
  • Ability to quickly learn and apply new technologies in a fast-evolving AI landscape.
  • Strong problem-solving skills with attention to performance, scalability, and reliability.
  • Degree in Computer Science, AI, Machine Learning, or a related field, or equivalent practical experience.

Benefits

  • Fully remote, globally distributed work environment
  • Opportunity to work on cutting-edge AI and decentralized technologies
  • High ownership and impact on core product infrastructure
  • Collaboration with top talent in AI, systems engineering, and fintech
  • Dynamic, fast-paced environment focused on innovation and experimentation
  • Exposure to advanced AI frameworks and next-generation product development
  • Competitive compensation aligned with experience and expertise

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!


Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

