Site Reliability Engineer (Remote)

CareerUS Solutions • United States
Remote
AI Summary

We are seeking a skilled Site Reliability Engineer to ensure the reliability, scalability, and performance of our systems and services. The ideal candidate will have experience with Linux/Unix-based systems, scripting or programming languages, and cloud platforms. They will work closely with software engineers, product teams, and operations to build resilient infrastructure and improve system availability.

Key Highlights
Ensure system reliability, scalability, and performance
Monitor system performance and availability
Automate operational tasks to improve efficiency
Technical Skills Required
Linux/Unix-based systems, Python, Go, Bash, AWS, Azure, Google Cloud Platform, Docker, Kubernetes, Prometheus, Grafana, Datadog
Benefits & Perks
Remote work
Full-time employment

Job Description


Job Title

Site Reliability Engineer (Remote – United States)

Job Summary

We are seeking a skilled and dependable Site Reliability Engineer to join our engineering team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our systems and services. You will work closely with software engineers, product teams, and operations to build resilient infrastructure and improve system availability while supporting a culture of continuous improvement.

This position is fully remote within the United States.

Key Responsibilities

  • Design, implement, and maintain reliable, scalable, and highly available systems
  • Monitor system performance and availability, identifying and resolving issues proactively
  • Automate operational tasks to improve efficiency and reduce manual intervention
  • Participate in incident response, root cause analysis, and post-incident reviews
  • Collaborate with development teams to improve application reliability and deployment processes
  • Maintain and enhance monitoring, alerting, and logging systems
  • Ensure systems meet security, compliance, and performance standards
  • Contribute to documentation, runbooks, and best practices for operational excellence
  • Support on-call rotations as needed to maintain system uptime

Required Qualifications

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
  • 3+ years of experience in Site Reliability Engineering, DevOps, or systems engineering roles
  • Strong experience with Linux/Unix-based systems
  • Proficiency in scripting or programming languages such as Python, Go, or Bash
  • Experience with cloud platforms (AWS, Azure, or Google Cloud Platform)
  • Familiarity with containerization and orchestration tools (Docker, Kubernetes)
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog)
  • Strong problem-solving and troubleshooting skills

Preferred Qualifications

  • Experience with Infrastructure as Code tools (Terraform, CloudFormation)
  • Knowledge of CI/CD pipelines and deployment automation
  • Understanding of networking, security best practices, and distributed systems
  • Prior experience supporting high-availability or large-scale production environments

Soft Skills

  • Strong communication and collaboration skills
  • Ability to work independently in a remote environment
  • Detail-oriented with a proactive approach to problem-solving
  • Commitment to reliability, quality, and continuous improvement
