Senior Observability Engineer - Datadog SME

Remote

Join our Digital Ops team as a Senior Observability Engineer with expertise in Datadog. Design, operate, and improve observability capabilities for cloud-native services. Collaborate with DevOps, SRE, and development teams.

Job Description


📌 Senior Observability Engineer – Datadog SME (LATAM)

We are looking for a Senior Observability Engineer with deep expertise in Datadog to join our Digital Ops team. This role is focused on owning and evolving the observability strategy for a large-scale, cloud-native environment supporting 150+ production services across multiple regions.

As a Datadog Subject Matter Expert, you will be responsible for designing, operating, and continuously improving observability capabilities, enabling engineering teams to build reliable, performant, and cost-efficient systems. You will work closely with DevOps, SRE, and development teams in an agile environment, acting as a technical reference for observability best practices.

🗓 Start date: ASAP

📆 Contract type: Full-Time, Remote, Contractor

🌐 Work hours and location: 8.00 am to 4.00 PM MST

🛠️ What You’ll Be Doing

  • Own and lead the observability architecture and strategy across cloud-native services running in multiple environments and regions.
  • Act as the Datadog Subject Matter Expert, owning configuration, governance, and best practices.
  • Design, implement, and maintain Datadog dashboards, monitors, alerts, SLOs, and service health views.
  • Operate and optimize Datadog APM, Logs, Metrics, Synthetic Monitoring, and RUM.
  • Drive alert quality improvements, signal-to-noise reduction, and proactive detection of operational issues.
  • Lead Datadog cost management and usage optimization initiatives in collaboration with engineering and finance stakeholders.
  • Partner with development teams to embed observability into the SDLC and production readiness processes.
  • Define and document runbooks, operational procedures, and observability standards.
  • Over time, join a shared on-call rotation: triage and resolve production incidents, act as incident commander when needed, and lead post-incident reviews.
  • Continuously identify opportunities for automation and toil reduction across observability and operational workflows.
  • Set, track, and report on operational excellence metrics including reliability, performance, availability, security, and cost.

✅ What You Need to Succeed

Must-haves

  • 3+ years of deep, hands-on experience with Datadog as an observability platform in production environments.
  • 5+ years of experience in DevOps, SRE, or Cloud Engineering roles supporting customer-facing systems.
  • Strong practical experience with Datadog APM, Logs, Metrics, dashboards, monitors, alerts, and SLOs.
  • Hands-on experience with Azure, Kubernetes, Terraform, Docker, and GitOps-based workflows.
  • Proven experience operating 24x7 production environments, including incident response, root cause analysis, and post-mortems.
  • Solid understanding of cloud-native architectures, distributed systems, and modern observability principles.
  • Ability to work independently in a fully remote, distributed team, with strong communication and collaboration skills.

Nice to have

  • Experience with ArgoCD, Azure DevOps CI/CD pipelines, and infrastructure automation.
  • Exposure to Databricks, SQL-based systems, or data-intensive platforms.
  • Hands-on experience building or extending custom DevOps/SRE tooling to reduce operational toil.
  • Relevant certifications (e.g. Datadog, Azure, Cloud Architecture, ITIL).

🧭 Our Recruitment Process

Here’s what to expect from our candidate-friendly interview process:

  • Initial Interview – 60 minutes with our Talent Acquisition Specialist
  • Culture Fit – 30 minutes with our Team Engagement Manager
  • Technical Assessment – Online Challenge/Multiple Choice Questionnaire
  • Final Stage – 60 minutes with the Hiring Manager

🌟 Why Join Launchpad?

We believe that great work starts with great people. At Launchpad, we offer:

  • People-first culture
  • Excellent compensation
  • Hardware setup for working from home
  • Agile methodologies
  • Diverse and multicultural work environment
  • Training allowances …and more!

✨ Ready to make your mark? Apply now and be part of something exciting.
