Job Description
Datacenter Infrastructure Reliability Lead
Location: Amsterdam, Netherlands - Hybrid (50/50 in office)
Relocation: possible and supported.
Compensation: Up to 200k base + 25% bonus + RSUs
Join a fast-growing AI infrastructure company building large-scale GPU and datacenter platforms from the ground up. This role is ideal for experienced infrastructure leaders who enjoy solving complex production issues, building teams from scratch, and operating at the intersection of hardware, Linux systems, and large-scale datacenter operations.
You will lead the highest escalation layer for critical infrastructure incidents across global datacenters.
Role Overview
Your team will be the final escalation point for anything related to datacenter IT hardware infrastructure (modern servers, GPUs, racks, networking, storage, etc.). Anything L2 cannot resolve will be escalated to this team.
This L3 team is not yet in place; you will be responsible for building and leading it from scratch.
Responsibilities
- Build, lead, and scale the L3 support team across regions, with full ownership of hiring, team structure, and performance
- Design and enforce the end-to-end incident response and escalation framework, including workflows, ownership models, and KPIs, and ensure adoption across multiple teams
- Act as Incident Commander for high-severity production incidents, driving structured mitigation, clear communication, and long-term resolution
- Own problem management and continuous improvement, identifying recurring failure patterns and translating them into scalable fixes across infrastructure and operations
What We're Looking For
- 10+ years of experience in large-scale datacenter environments
- 3+ years of experience leading highly technical teams
- 3+ years of experience building teams (hiring and performance management)
- Experience setting up frameworks, processes, and workflows from scratch
Nice to Have
- Deep troubleshooting capability across Linux, server hardware, and firmware (BIOS/BMC), with the ability to guide investigations at a systems engineer level
- Strong familiarity with GPU server platforms and common diagnostics (e.g. nvidia-smi, dcgmi, Linux log correlation)