Senior Site Reliability Engineer
Rev is hiring a Senior Site Reliability Engineer for a remote role based in Brazil. The role joins the Platform Engineering team to build and evolve the platform that engineering relies on.
Job Description
Job profile: Senior Site Reliability Engineer (SRE) / DevOps
Location: Brazil (remote)
Job type: Long-term contract
Role Overview
Rev is hiring a Senior Site Reliability Engineer (SRE) to join its Platform Engineering team. This role is responsible for the shared AWS infrastructure that supports Rev’s core products, including a large monolithic web application as well as a growing set of microservices. This is not a “ticket‑based ops” role — it’s about building and evolving the platform that engineering relies on.
Core Responsibilities
Infrastructure & Reliability
- Own and manage shared AWS infrastructure used across the company
- Maintain and operate EKS clusters
- Ensure reliability, scalability, and performance of production systems
- Monitor infrastructure health and proactively address issues
Observability & Monitoring
- Own monitoring, logging, and alerting across infrastructure and applications
- Heavy use of:
  - Grafana
  - OpenSearch clusters
- Design alerts that:
  - Detect infra and application issues early
  - Are actionable (not noisy)
- Drive observability standards across teams
CI/CD & Automation
- Design, build, and maintain CI/CD pipelines
- Improve deployment safety, speed, and consistency
- Automate infrastructure and development workflows
- Partner closely with Engineering and QA to support reliable releases
Must‑Have Experience
- Senior‑level experience in SRE, DevOps, or Platform Engineering
- Strong AWS experience
- Infrastructure as Code (Terraform preferred)
- Kubernetes / EKS in production environments
- Designing and operating CI/CD pipelines
- Hands‑on experience with observability tooling:
  - Monitoring
  - Logging
  - Alerting (Grafana or similar)
Senior Site Reliability Engineer (SRE)
Platform Engineering
How this role will Serve, Own, and Grow at Rev
At Rev, we’re on a mission to understand the human voice. We’re looking for a Senior SRE to join our Platform Engineering team and help design, scale, and optimize our cloud-based production infrastructure. This is a high-impact role for someone who thrives in a startup environment and is excited to shape the future of Rev’s platform as we scale.
This role is ideal for someone passionate about automation, observability, reliability, and continuous improvement—and who enjoys solving problems at scale.
Responsibilities
- Manage infrastructure and observability for Rev’s cloud-based applications
- Design and maintain CI/CD pipelines for scalable, testable deployments
- Automate development and infrastructure workflows to improve velocity
- Partner with Engineering and QA to deliver reliable environments and tooling
- Analyze infrastructure performance and implement data-driven optimizations
- Enhance monitoring and alerting to proactively prevent service degradation
- Help define and execute the DevOps roadmap for a growing Engineering org
- Contribute to internal tooling, including our custom chatbot (“Chopper”)
Qualifications
- BS or MS in Computer Science or a related technical field
- 5+ years of experience in SRE, DevOps, or Software Engineering
- Strong AWS experience with Terraform or other IaC tools
- Experience running containerized workloads (Kubernetes / EKS)
- Deep familiarity with CI/CD pipelines across APIs, web apps, and data services
- Experience with observability tooling (monitoring, alerting, logging) such as Grafana
- Strong communication skills in distributed, remote-first environments
- Comfortable operating in a fast-paced, evolving startup culture
Nice to have
- LGTM stack (Loki, Grafana, Tempo, Mimir)
- Redis, SQL Server, Elasticsearch
- Jenkins, GitHub Actions, Spinnaker
- Experience supporting large or highly distributed teams