Lead large-scale SRE programs, manage teams, and drive organizational change. The role calls for 15-20 years of experience and expertise in cloud-native architectures and observability tools.
Job Description
- 15-20 years of experience
- 8+ years in leadership roles managing large-scale SRE programs
- Deep understanding of cloud-native architectures (AWS, Azure, GCP), microservices, and distributed systems.
- Proficiency with Application Performance Monitoring (APM) tools such as New Relic or Dynatrace for monitoring, logging, and tracing, and with Splunk for log monitoring.
- Expertise in observability tools (e.g., Prometheus, Grafana, Datadog), CI/CD pipelines, and infrastructure as code (Terraform, Ansible).
- Strong experience with incident response, chaos engineering, and reliability testing.
- Proven ability to influence cross-functional teams and drive organizational change.