Job Description
Company Overview
Our client is a new, fully remote market maker on prediction markets, funded by a tier-1 operator to build next-generation trading systems and drawing on that operator's extensive industry IP and resources. The founding team includes industry veterans with a vision for the future. They are building a lean, senior team focused on high-performance systems, pragmatic engineering, and rapid iteration.
Mission
Build and operate the core platform and trading infrastructure that powers this new prediction market trading system.
Why This Role
You will own end-to-end infrastructure, freeing software engineers to ship features rapidly while ensuring reliability, performance, and security. This is the first dedicated infrastructure leadership hire and a force multiplier for the entire engineering team.
What You Will Do
- Design, build, and operate cloud infrastructure on Azure to support low-latency integrations with the operator systems and multiple external exchanges.
- Establish Kubernetes-based container orchestration; manage cluster sizing, autoscaling, and multi-environment deployments.
- Stand up and optimize core services: Kafka (event streaming), Redis (caching), and Postgres (OLTP).
- Implement robust CI/CD, secrets management, environment isolation, and infrastructure-as-code (Terraform/Bicep).
- Own observability across metrics, logging, and distributed tracing using Prometheus, Grafana, and OpenTelemetry (OTEL); drive incident response, SLOs/SLIs, and on-call engineering practices.
- Design and manage cloud networking: VNets, subnets, peering, private endpoints, DNS, firewall rules, and secure connectivity between our systems and the operator.
- Harden security and compliance across identity, network segmentation, secrets, and OS baselines within the Azure/Windows ecosystem.
- Drive performance engineering for market data ingestion and order routing; remove bottlenecks across the stack.
- Partner with software engineers to shape service boundaries, data contracts, and platform primitives that accelerate delivery.
- Manage cloud cost efficiency and capacity planning while ensuring reliability and high availability.
- Set standards and grow an infrastructure function that scales with the company.
What You Bring
- Senior-level infrastructure experience (8+ years) including significant time operating production systems at scale.
- Deep cloud infrastructure expertise. The current stack runs on Azure, but strong candidates from AWS or GCP backgrounds are equally welcome: cloud fundamentals matter more than platform-specific familiarity.
- Hands-on with Kubernetes, Kafka, Redis, and Postgres in production.
- Expertise in networking, security, identity (e.g., Azure AD), and infra-as-code (Terraform/Bicep or similar).
- Track record of building reliable, observable platforms with strong CI/CD and release engineering; hands-on experience with Prometheus, Grafana, and OpenTelemetry or comparable observability stacks.
- Strong cloud networking fundamentals: VNet design, private endpoints, DNS, firewall/NSG rules, and secure cross-system connectivity on Azure.
- Performance tuning for latency- and throughput-sensitive workloads.
- Excellent collaboration skills; able to translate product and trading needs into platform capabilities.
- A hands-on, curious approach to AI tooling. The team expects everyone to actively use and explore AI tools in their daily work, whether for infra-as-code generation, log analysis, runbook automation, or incident triage. You don't need to be an AI researcher, but you should be genuinely experimenting and keeping pace with a rapidly evolving landscape.
Nice to Have
- Experience in trading, sports betting, exchanges, financial markets, or other real-time systems.
- Experience with event-driven architectures and stream processing.
- Background in SRE leadership, incident command, or reliability programs.