Site Reliability Engineer (FinTech)

OneSparQ • United States
Remote
AI Summary

Join a growing FinTech company as a Site Reliability Engineer. This fully remote position requires 5+ years of experience in Site Reliability Engineering, with a focus on application performance monitoring, cloud platforms (preferably AWS), and CI/CD tools.

Key Highlights
5+ years of experience in Site Reliability Engineering
Experience with application performance monitoring and observability tools
Cloud experience - preferably in AWS
Solid experience with scripting languages (e.g. shell scripts, Python)
Git and CI/CD tools (GitHub, Jenkins, etc.)
SOA architecture experience utilizing microservices
Experience with scalable, high-performance, multi-tier, enterprise application development
Technical Skills Required
Datadog, AWS, Shell Scripting, Python, Git, Jenkins, Microservices, SOA Architecture
Benefits & Perks
Remote work

Job Description


OneSparQ is looking for a Site Reliability Engineer to join a growing FinTech Company. This position is fully remote.


Required Skills:

  • 5+ years of experience in Site Reliability Engineering
  • Experience debugging complex problems
  • Experience with application performance monitoring and observability tools such as Datadog
  • Cloud experience - preferably in AWS
  • Solid experience with scripting languages (e.g. shell scripts, Python)
  • Git and CI/CD tools (GitHub, Jenkins, etc.)
  • SOA architecture experience utilizing microservices
  • Experience with scalable, high-performance, multi-tier, enterprise application development


Additional Skills (not required):

  • Experience with virtualization, containerization, and orchestration (e.g. VMware, Kubernetes, etc.)
  • Experience with provisioning and configuration management solutions such as Terraform, Ansible, SaltStack, etc.


Responsibilities:

  • Keep services up and running, and restore them quickly when failures occur
  • Deploy new builds to production
  • Monitor application performance
  • Work closely with internal partners and teams to ensure that we ship software that meets security, SLA, and performance requirements
  • Implement operational automation (IaC) for monitoring, managing, deploying, and validating systems and applications
  • Provide on-call support
  • Manage and expand relationships with internal development and outsourced managed service partners for software systems design and development
  • Triage alerts, diagnose and resolve critical issues, and manage the implementation of changes
  • Coordinate capacity planning
  • Develop CI/CD orchestration systems to reduce friction for software delivery to production
  • Define, execute, and analyze Operational Acceptance Test initiatives
  • Write, update, and use documentation, including runbooks/playbooks
  • Automate work including infrastructure needs, testing, failover solutions, failure mitigation, and much more
