Data Center Operations Lead

IREN Canada

Job Description

Job Type: Full-time | Location: Prince George, BC | Department: Operations | Reporting to: IOC Manager | Work Location Type: On-site


IREN is a leading AI Cloud Service Provider, delivering large-scale GPU clusters for AI training and inference. IREN’s vertically integrated platform is underpinned by its expansive portfolio of grid-connected land and data centers in renewable-rich regions across the U.S. and Canada.

With 100% renewable energy, we build, own and operate our data centers and take pride in being at the forefront of sustainable solutions for the ever-evolving applications of high-performance compute. We believe that human progress is invaluable, but it should be pursued in the right way: responsibly, sustainably, and with a positive impact on the communities we operate in.

As the Data Center Operations Lead, you'll play a pivotal role in developing new data center projects. This is your chance to thrive in a high-growth environment, driving impactful projects at the cutting edge of energy and technology infrastructure.

Job Requirements

  • 3+ years in mission-critical, 24/7 data center operations, including leadership, mentoring, and shift-lead roles for small teams.
  • Expert in real-time incident management, ensuring accurate categorization, prioritization, triage, and stabilization of IT hardware, network, and facility issues.
  • Skilled in monitoring GPU compute clusters, network environments, and facility infrastructure, validating alerts and coordinating immediate response actions.
  • Proficient in ITSM ticketing: validation, prioritization, routing, and initial root-cause assessment for engineering follow-up.
  • Experienced in coordinating on-site smart-hands support, directing technicians and facilities staff based on incident priority and operational impact.
  • Strong commitment to operational continuity through shift handovers, accurate logging, risk communication, and clear inter-team collaboration.
  • Bachelor’s degree in a technical field or equivalent military/technical operations experience.
  • Client Obsession: You take pride in delivering outstanding service. 
  • Teamwork: You collaborate openly and support others’ success. 
  • Accountability: You own your work from start to finish. 
  • Curiosity: You ask questions, learn quickly, and seek to improve. 
  • Cultural Fit: You bring positivity, reliability, and a growth mindset to every interaction.  
This role requires availability for a rotational shift schedule including day, night, and weekend coverage. Candidates must be comfortable working overnight and maintaining productivity during off-peak hours. A shift premium may apply.

Job Responsibilities

  • Serve as the primary real-time decision-maker for active incidents while embedded onsite with shift teams, ensuring accurate categorization, prioritization, timely triage, clear communication, and stabilization actions across IT hardware, network, GPU clusters, and facility systems.
  • Monitor IOC dashboards for GPU cluster, network, and facility alerts, validating signal quality, assessing operational impact, and coordinating appropriate response actions in real time alongside the shift team.
  • Validate ticket prioritization, ensure accurate routing, and identify emerging ticketing patterns or repeat issues, directing follow-up tasks to IOC analysts, day-shift leads, or on-site technicians as needed.
  • Coordinate smart-hands activities and support requests related to GPU clusters, facility systems, and network operations, providing guidance to technicians and ensuring safe, timely execution.
  • Escalate critical or high-impact incidents to the IOC Manager or Tier 2/Engineering teams with clear context, documented evidence, and recommended next steps.
  • Perform initial root-cause assessment (RCA) by collecting evidence, timelines, logs, and observations to establish category and priority, then coordinate the handoff of deeper investigation and full RCA tasks to IOC analysts and day-shift leads.
  • Conduct structured shift handover briefings and maintain precise operational logs, ensuring operational continuity, situational awareness, and seamless transition between shifts.
  • Contribute to process improvements by identifying operational gaps, recurring issues, or workflow inefficiencies observed during the shift, and proposing actionable solutions.
  • Provide guidance and mentorship to shift analysts, fostering consistent application of incident management standards and best practices while embedded onsite.

Why Join Us?

  • Be part of a mission-critical environment that supports high-performance computing
  • Collaborate with industry professionals who are passionate about technology, reliability, and efficiency
  • Contribute to a team that values innovation, growth, and technical excellence

Job Benefits

Compensation & Rewards
  • Competitive hourly rate, finalized based on experience and impact
  • RRSP matching program to help you plan for your future
  • Relocation assistance and support to get you settled

Wellbeing & Benefits
  • Comprehensive extended health and dental coverage to keep you and your family supported
  • Paid vacation to recharge, travel, or simply enjoy more life outside of work

Growth & Development
  • Professional development to support certifications, continuing education, or role-related training

Community & Culture
  • Company events and team-building activities

We value diverse perspectives and believe that skills can be developed. If you’re passionate about this role, we want to hear from you, whether or not you meet every criterion. Your unique experiences might be exactly what we need!

Podtech Data Centers Inc., the employing entity and proud member of the IREN Group, is an equal opportunity employer that is committed to creating an inclusive workplace. We evaluate qualified applicants without regard to race, colour, religion, age, sex, sexual orientation, gender identity, genetic information, national origin, disability, veteran status, and other legally protected characteristics.

By applying for this position and submitting your resume and application materials, you consent to the processing of your personal information in accordance with our Job Applicant Privacy Statement available on our website at www.iren.com.  

 


