Senior Machine Learning Engineer

Lumicity • United States
Relocation
AI Summary

Design and optimize ML systems, build repeatable execution patterns, and troubleshoot scalability and reliability bottlenecks. Expertise in distributed systems, infrastructure as code, and systems fluency required. Collaborate with cross-functional teams to bridge hardware capabilities and developer experience.

Key Highlights
Architect Infrastructure
Orchestration & Reliability
Performance Engineering
Cross-Functional Partnership
Technical Skills Required
Distributed Systems
Infrastructure as Code
Kubernetes
SLURM
Linux Networking Stack
GPU-Accelerated Environments
Benefits & Perks
Competitive industry salary
RSUs
100% paid insurance plans
PTO
Paid Holidays
401(k)
Paternity/Maternity Leave
FSA
Paid Life Insurance
Mental Health Support

Job Description


*** This role is not a remote job; it is on-site. You must be willing to relocate for this role***



The Mission

We represent a pioneering AI Infrastructure & Cloud Services firm dedicated to dismantling the barriers to large-scale AI innovation. Our client is creating seamless, resilient, and secure environments for the world’s builders.


The Role

As a Senior Machine Learning Engineer, you’ll be architecting and operating the core systems that power massive-scale distributed training and inference. You will sit at the intersection of workload orchestration, cluster operations, and performance engineering.


Core Responsibilities

  • Architect Infrastructure: Design and optimize ML systems that support massive distributed training and high-concurrency inference workloads.
  • Orchestration & Reliability: Build repeatable execution patterns across shared, high-density compute environments.
  • Performance Engineering: Troubleshoot and resolve complex scalability and reliability bottlenecks.
  • Cross-Functional Partnership: Collaborate with Systems and Platform teams to bridge the gap between hardware capabilities and developer experience.


Technical Profile

  • Expertise in Distributed Systems: Proven experience managing ML workloads across large-scale clusters.
  • Infrastructure as Code: Proficiency in orchestrating GPU-accelerated environments (Kubernetes, SLURM).
  • Systems Fluency: Deep understanding of the Linux networking stack, drivers, and low-level performance tuning.
  • Scale Mindset: Experience solving problems that only emerge when moving from "handfuls of devices" to massive, warehouse-scale compute.


Benefits

  • Competitive industry salary
  • RSUs
  • 100% paid insurance plans (medical, dental, vision)
  • PTO
  • Paid Holidays
  • 401(k)
  • Paternity/Maternity Leave
  • FSA
  • STDI
  • Life Insurance
  • Mental Health Support


Why This Role?

This isn't a "maintenance" job. Our client is solving problems that haven't been solved yet. You will be pushed daily to innovate and build the infrastructure that will define the next decade of AI development.


Agency Note: This position is exclusive and requires relocation, because the client's rapid growth demands an on-site presence. We prioritize candidates who demonstrate a "builder" mindset and a deep commitment to scaling a promising company.


*** This role is not a remote job. You must be willing to relocate for this role***

