Senior Site Reliability Engineer

Dev.Pro • Greater Buenos Aires
Remote
This job is no longer active and is no longer accepting applications.
AI Summary

Join Dev.Pro as a Senior Site Reliability Engineer to enhance the stability, observability, and efficiency of our services. Lead initiatives in monitoring, automation, and reliability practices. Collaborate with engineering teams to ensure smooth system operations.

Key Highlights
Lead initiatives to improve stability, observability, and efficiency of critical services
Collaborate with engineering teams to solve complex problems and drive operational excellence
Define and enforce logging, tracing, and metrics standards across services
Technical Skills Required
OpenTelemetry, PromQL, SPL, Datadog, Splunk APM, Google Cloud APM, MuleSoft, API gateway observability, Logback, Serilog, JSON logging, Java, .NET, Node.js, Python
Benefits & Perks
Remote work
30 paid days off per year
5 paid sick days
Up to 60 days of medical leave
Up to 6 paid days off per year for major family events
Partially covered health insurance after probation
Wellness bonus for gym memberships, sports nutrition, and similar needs after 6 months
Overtime coverage
English lessons and Dev.Pro University programs
Fun online activities and team-building events

Job Description


🟢 Are you in Brazil or Argentina? Join us as we actively recruit in these locations, offering a comfortable remote environment. Submit your CV in English, and we'll get back to you!

We invite a Senior Site Reliability Engineer to join our dynamic team. In this hands-on role, you'll focus on improving the stability, observability, and efficiency of our services. You'll lead initiatives to enhance monitoring, automation, and reliability practices while collaborating with engineering teams to ensure our systems run smoothly and remain resilient.

🟩 What's in it for you:

  • Join a top S&P 500 company shaping the future of global payments and financial technology
  • Lead initiatives to improve stability, observability, and efficiency of critical services
  • Collaborate with engineering teams to solve complex problems and drive operational excellence

✅ Is that you?

  • 5+ years in site reliability, observability, or platform engineering
  • Experience building SRE or observability practices from scratch
  • Hands-on OpenTelemetry experience (SDKs and Collector)
  • Strong experience with PromQL/SPL and at least one APM platform (Datadog, Splunk APM, Google Cloud APM)
  • Experience designing SLOs and alerting strategies (burn rate, multi-window)
  • Familiarity with MuleSoft or API gateway observability
  • Awareness of security best practices (PII redaction, access control)
  • Experience building automation scripts for CI/CD tasks
  • Experience with logging frameworks (Logback, Serilog) and structured JSON logging
  • Collaboration, communication, and independent problem-solving skills
  • Upper-Intermediate+ English level

🧩 Key responsibilities and your contribution

In this role, you'll own and lead efforts to ensure the reliability, observability, and operational efficiency of our services.

  • Define and enforce logging, tracing, and metrics standards across services
  • Implement and maintain centralized telemetry pipelines and APM integrations
  • Build reusable instrumentation libraries for core languages (Java, .NET, Node.js, Python)
  • Establish dashboards and SLO/error budget alerts
  • Ensure log/trace correlation and schema consistency
  • Implement PII/secret redaction, retention, and cost optimization
  • Collaborate with development teams to onboard services and ensure observability readiness
  • Develop runbook templates, documentation, and training materials for engineering teams
  • Audit alerts, reduce noise, and maintain alert quality standards
  • Support incident response through tooling improvement and post-incident telemetry analysis

🎾 What's working at Dev.Pro like?

Dev.Pro is a global company that's been building great software since 2011. Our team values fairness, high standards, openness, and inclusivity for everyone, no matter your background.

๐ŸŒ We are 99.9% remote โ€” you can work from anywhere in the world

🌴 Get 30 paid days off per year to use however you like: vacations, holidays, or personal time

✔️ 5 paid sick days, up to 60 days of medical leave, and up to 6 paid days off per year for major family events like weddings, funerals, or the birth of a child

⚡️ Partially covered health insurance after the probation period, plus a wellness bonus for gym memberships, sports nutrition, and similar needs after 6 months

💵 We pay in U.S. dollars and cover all approved overtime

📓 Join English lessons and Dev.Pro University programs, and take part in fun online activities and team-building events

Our next steps:

✅ Submit a CV in English → ✅ Intro call with a Recruiter → ✅ Internal interview → ✅ Client interview → ✅ Offer

Interested? Find out more:

📋 How we work

💻 LinkedIn Page

📈 Our website

💻 IG Page
