Senior Observability Engineer - Datadog SME

Remote

Job Description


Who We Are

Launchpad is a global technology partner connecting top talent with high-impact projects across North America and beyond. We specialize in Staff Augmentation and product development, helping companies scale with agility while empowering professionals to grow in meaningful, remote-first environments.

📌 Senior Observability Engineer – Datadog SME (LATAM)

We are looking for a Senior Observability Engineer with deep expertise in Datadog to join our Digital Ops team. This role is focused on owning and evolving the observability strategy for a large-scale, cloud-native environment supporting 150+ production services across multiple regions.

As a Datadog Subject Matter Expert, you will be responsible for designing, operating, and continuously improving observability capabilities, enabling engineering teams to build reliable, performant, and cost-efficient systems. You will work closely with DevOps, SRE, and development teams in an agile environment, acting as a technical reference for observability best practices.

🗓 Start date: ASAP

📆 Contract type: Full-Time, Remote, Contractor

🌍 Work hours and location: 8:00 AM to 4:00 PM MST

🛠️ What You'll Be Doing

  • Own and lead the observability architecture and strategy across cloud-native services running in multiple environments and regions.
  • Act as the Datadog Subject Matter Expert, owning configuration, governance, and best practices.
  • Design, implement, and maintain Datadog dashboards, monitors, alerts, SLOs, and service health views.
  • Operate and optimize Datadog APM, Logs, Metrics, Synthetic Monitoring, and RUM.
  • Drive alert quality improvements, signal-to-noise reduction, and proactive detection of operational issues.
  • Lead Datadog cost management and usage optimization initiatives in collaboration with engineering and finance stakeholders.
  • Partner with development teams to embed observability into the SDLC and production readiness processes.
  • Define and document runbooks, operational procedures, and observability standards.
  • Eventually participate in a shared on-call rotation, triaging and resolving production incidents, acting as incident commander when needed, and leading post-incident reviews.
  • Continuously identify opportunities for automation and toil reduction across observability and operational workflows.
  • Set, track, and report on operational excellence metrics including reliability, performance, availability, security, and cost.

✅ What You Need to Succeed

Must-haves

  • 3+ years of deep, hands-on experience with Datadog as an observability platform in production environments.
  • 5+ years of experience in DevOps, SRE, or Cloud Engineering roles supporting customer-facing systems.
  • Strong practical experience with Datadog APM, Logs, Metrics, dashboards, monitors, alerts, and SLOs.
  • Hands-on experience with Azure, Kubernetes, Terraform, Docker, and GitOps-based workflows.
  • Proven experience operating 24x7 production environments, including incident response, root cause analysis, and post-mortems.
  • Solid understanding of cloud-native architectures, distributed systems, and modern observability principles.
  • Ability to work independently in a fully remote, distributed team, with strong communication and collaboration skills.

Nice to have

  • Experience with ArgoCD, Azure DevOps CI/CD pipelines, and infrastructure automation.
  • Exposure to Databricks, SQL-based systems, or data-intensive platforms.
  • Hands-on experience building or extending custom DevOps/SRE tooling to reduce operational toil.
  • Relevant certifications (e.g. Datadog, Azure, Cloud Architecture, ITIL).

🧭 Our Recruitment Process

Here's what to expect from our candidate-friendly interview process:

  • Initial Interview – 60 minutes with our Talent Acquisition Specialist
  • Culture Fit – 30 minutes with our Team Engagement Manager
  • Technical Assessment – Online Challenge/Multiple Choice Questionnaire
  • Final Stage – 60 minutes with the Hiring Manager

🌟 Why Join Launchpad?

We believe that great work starts with great people. At Launchpad, we offer:

  • People-first culture
  • Excellent compensation
  • Hardware setup for working from home
  • Agile methodologies
  • Diverse and multicultural work environment
  • Training allowances …and more!

✨ Ready to make your mark? Apply now and be part of something exciting.

Compliance & Equal Opportunity

Launchpad is an equal opportunity employer committed to creating an inclusive environment for all applicants. We do not discriminate on the basis of race, color, religion, gender identity, sexual orientation, age, disability, or any other protected status under applicable laws in Canada and British Columbia.

All candidate information will be handled confidentially and used solely for recruitment purposes in accordance with applicable privacy regulations.
