Senior Site Reliability Engineer

Doghouse Recruitment • United States

Remote • Visa Sponsorship

This job is no longer active. This position is no longer accepting applications.

AI Summary

We're building a cloud platform for high-throughput, compute-heavy workloads. As a Senior SRE, you'll own production reliability end-to-end, define SLIs/SLOs, run error budget conversations, and ship changes that reduce incidents and improve latency.

Key Highlights

Define SLIs/SLOs

Run error budget conversations

Ship changes to reduce incidents and improve latency

Technical Skills Required

Linux, Kubernetes, Terraform, Docker, Helm, Go, Python, C++

Benefits & Perks

Up to $225k base salary

Additional bonus

Stock options

Full remote work in the US

Residence permit required

Job Description

We’re building a cloud platform for high-throughput, compute-heavy workloads. We operate large-scale infrastructure where failure modes are real, capacity is finite, and reliability needs to be engineered, not “handled”.

As a Senior SRE, you’ll own production reliability end-to-end: define SLIs/SLOs, run error budget conversations, and ship changes that reduce incidents and improve latency (p95/p99). You’ll build automation to kill toil, raise deployment safety (canary/rollback), and turn observability into signal instead of noise.

This is a bare-metal environment: think Linux, data centers, physical fleets, and real hardware constraints, not managed services. You’ll work close to the metal across Kubernetes internals (scheduling, autoscaling behavior, kubelet pressure/evictions, etcd/control plane), Linux performance (CPU/memory/IO contention), and network debugging (DNS/TCP/TLS, packet loss, congestion). On-call is part of the job, but success is measured by how much you reduce it.

Requirements

Senior-level experience in Site Reliability Engineering / Production Engineering running bare metal / on-prem / data center infrastructure (not public cloud only)
Deep hands-on expertise in Linux systems debugging and performance (CPU, memory, IO, kernel-level behaviors)
Strong understanding of networking (DNS/TCP/TLS, latency, packet loss, congestion, troubleshooting under load)
Strong Kubernetes experience beyond manifests: scheduler behavior, autoscaling edge cases, kubelet pressure/evictions, etcd/control plane
Experience with Terraform, Docker, Helm, and modern CI/CD practices
Some coding skills in Go, Python, and/or C++

If you are looking for complexity and a new place to nerd out on infrastructure optimisation, please apply.

BASE SALARY: up to $225k, plus bonus and stock options

FULL REMOTE IN USA

Residence permit required.

Job Overview

Posted Date: Jan 15, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: United States

Annual Salary: up to 225,000 USD

Category: DevOps

Company: Doghouse Recruitment
