We are seeking a Senior DevOps Engineer to own and enhance developer experience end-to-end, including deployments, CI/CD pipelines, and dev environments. You will work alongside an existing DevOps engineer to migrate legacy services and ensure every engineer ships faster with less friction. The role requires strong coding skills, deep Kubernetes and cloud experience, and a focus on making smart trade-offs for a team serving millions of users.
Job Description
About Replika
Ten years ago, we built the world’s first AI companion — before LLMs, before ChatGPT, and before anyone knew what AI could be.
What started as a way to patch a hole in our hearts became something unexpected: a catalyst. Millions of people told us Replika helped them reconnect with the world. They texted that old friend. They picked up that forgotten hobby. They took that walk around town. While today’s tech keeps people scrolling, we discovered AIs can push us outward, if we let them.
Now we’re being reborn with a new vision: the first AI with heart, for making the most out of life. Gentle nudges to meet friends when you’re feeling shy. Ideas for exploring new places when it’s easier to stay home. Daily conversation about whatever moves you, whether ballet, philosophy, aliens, or gossip, at 2am when no one else is awake. We’re not building an AI to validate or pacify. We’re building something modeled after the relationships that transform us. Someone who asks the right questions at the right time. Someone who helps you look inward, and pushes you outward. We’ve been featured in TED Talks, Stanford and Harvard studies, and the Lex Fridman podcast, because 40M+ people connect with Replika in a deeply human way. And we’re just getting started.
About The Role
We need a senior DevOps engineer who treats developer experience as a first-class product. You’ll work alongside our existing DevOps engineer, who’s focused on migrating legacy services into a clean deployment story. Your job is different: make every engineer at Replika ship faster, with less friction, fewer footguns, and more confidence.
This isn't a hyperscaler role. We serve millions, not billions, so the work looks more like smart trade-offs across a small senior team than mega-scale architecture. We need someone who understands how engineering actually works, writes real code, and genuinely cares about the daily experience of the people building the product.
You'll also need product fluency — not at a backend engineer's depth, but enough to know what runs on which cluster and why. If image moderation can run as a batch job on cheaper preemptible GPUs with a few seconds of latency, you should spot the cost optimization and propose it, not wait for someone else to.
Responsibilities
- Own developer experience end-to-end: deployments, staging, dev environments, CI/CD pipelines. Find what’s painful, fix it, then make it stay fixed.
- Be a force multiplier for the engineering team. Every hour you save the team compounds. You think in those terms.
- Maintain, improve, and sometimes rebuild our CI/CD pipelines. Deploys should be safe by default — it should be hard to break things and easy to recover when something does break.
- Work alongside our existing DevOps engineer on the broader infra picture: cluster management, observability, secrets, networking, cost. You split the work based on what each of you is best placed to own.
- Build and maintain templates and tooling for spinning up new services and dev environments. The 47th time someone needs a new service, it should take 10 minutes, not 3 days.
- Understand what runs on our clusters and why. Recommend and implement efficiency wins where you see them — right-sizing, autoscaling, scheduling, GPU utilization. You don’t need permission to make sensible cost calls.
- Write real code when the situation calls for it. Internal tooling, glue services, automation scripts — you don’t hand it off to a backend engineer.
- Care about the small stuff: clear runbooks, sane defaults, good error messages, deploys that fail safely. The kind of work that nobody notices when it’s working well — which is the point.
Requirements
- 5+ years in DevOps, SRE, or platform engineering at companies where you actually owned developer experience, not just kept the lights on.
- Strong coding skills. You write production-quality Python (or similar), not just shell scripts and YAML. You can read backend code well enough to debug a deploy issue without escalating.
- Deep experience with Kubernetes and Docker. You can debug a pod that’s misbehaving, design sensible deployment configs, and reason about resource limits without guessing.
- Hands-on with cloud platforms (AWS or GCP). You’ve managed real workloads in production, not just done a certification.
- CI/CD chops. TeamCity experience is a plus, but more important is that you’ve owned a pipeline end-to-end and made it fast, reliable, and easy to understand. We use TeamCity and GitHub Actions.
- You think in terms of developer experience. You’ve made other engineers’ lives meaningfully better and can talk specifically about how.
- Product instincts. You’re comfortable asking “what does this service actually do” and using the answer to make infra decisions.
- Strong English (B2+). We’re distributed across time zones — clear written communication matters.
- Familiarity with observability stacks (Prometheus, Grafana, Datadog, or similar) and a view on what’s worth alerting on versus what’s noise. You find observability gaps and make sure they are filled.
Nice to Have
- Experience with GPU clusters and ML workloads — model serving, autoscaling inference, cost optimization for GPU-heavy services.
- Interest in AI-assisted engineering workflows, backed by projects you can point to.
- Experience with Terraform, Pulumi, or another IaC tool. You version your infrastructure.
- You’ve introduced a developer experience improvement at a previous company that you’re genuinely proud of — something you can describe in detail and quantify the impact of.
- Experience at a B2C company with real users on the line, where downtime translates directly to user trust.
Benefits
- Competitive compensation.
- Flexible work schedule and generous vacation policy.
- Unlimited access to the latest AI coding assistants.
- Direct impact on millions of users worldwide within months.
- Push the boundaries of applied AI in a consumer setting.
- Fully remote.