Senior DevOps Engineer - Cloud Native Infrastructure Lead

AI Summary

Lead a global DevOps team, design and build cloud infrastructure on GCP, and drive automation, reliability, and scalability.

Key Highlights
Lead and mentor a team of DevOps engineers
Design, build, and improve cloud infrastructure on Google Cloud Platform (GCP)
Manage Kubernetes and Terraform environments in production
Technical Skills Required
Kubernetes, Terraform, Bash, Python, Google Cloud Platform (GCP), ArgoCD, Jenkins, GitLab CI, Prometheus, Grafana
Benefits & Perks
Collaborative working environment
State-of-the-art tools & equipment
Inclusive corporate culture

Job Description


Our client is a fast-growing AI technology company redefining how large-scale dynamic pricing is handled in real time. The company partners with some of the world’s leading airlines and has developed a platform that enables enterprise clients to move beyond manual pricing models and embrace a fully autonomous, AI-driven system that dynamically adjusts to real-time market conditions.

In this role, you will lead a global DevOps team responsible for building and optimizing complex, cloud-native infrastructure supporting Fetcherr’s AI-powered platform. You’ll work on high-performance systems deployed on Google Cloud Platform (GCP), driving automation, reliability, and scalability across multiple environments. It’s an excellent opportunity to become a key expert within the organization, with real autonomy in decision-making, influence over technical direction, and an environment that actively encourages your ideas and innovation.

This position is hybrid in Miami, and the company welcomes candidates willing to relocate.


Responsibilities

Lead and mentor a team of DevOps engineers to deliver reliable, secure, and scalable infrastructure

Design, build, and improve cloud infrastructure on Google Cloud Platform (GCP) for high performance and resilience

Manage Kubernetes and Terraform environments in production, ensuring uptime and deployment efficiency

Automate CI/CD pipelines, release processes, and infrastructure management to speed up delivery and reduce errors

Set up and maintain monitoring, alerting, and logging systems (Prometheus, EFK, GCP Monitoring) for early issue detection and fast resolution

Work closely with development, data, and product teams to align infrastructure with business and technical goals

Define and enforce Infrastructure as Code (IaC) standards for consistency and reliability

Improve internal tools to simplify deployments and boost developer productivity

Evaluate new tools and technologies that can strengthen performance, security, and scalability


Requirements

6+ years in DevOps, Site Reliability Engineering, or similar roles

2+ years leading or managing engineering teams

Strong experience with Kubernetes in production and advanced Helm chart management

Deep knowledge of Terraform and Infrastructure as Code

Solid scripting skills in Bash, Python, or another relevant language

3-4 years of hands-on experience with GCP deployments and services

Experience building and maintaining CI/CD pipelines (ArgoCD, Jenkins, GitLab CI, etc.)

Strong background in monitoring and observability (Prometheus, Grafana, GCP Monitoring)


Nice to Have

Experience building Kubernetes operators or extending ArgoCD

Familiarity with Big Data or MLOps environments

Knowledge of Airflow, Kubeflow, or MLflow


The company is committed to creating a diverse environment and is proud to be an equal-opportunity employer. It provides a collaborative working environment, resources and state-of-the-art tools and equipment to promote success, and a welcoming, inclusive corporate culture where individuals are recognized for their contributions.

