Senior Infrastructure Engineer - Healthcare SaaS

The Judge Group, United States
Remote
AI Summary

Build reliable infrastructure for healthcare at scale. Own and strengthen the infrastructure behind a mission-critical healthcare SaaS platform. Work on high-availability systems that process sensitive healthcare data and support always-on clinical workflows.

Key Highlights
Healthcare Impact at SaaS Scale
High-Availability by Design
True Ownership
Key Responsibilities
Provision, manage, and optimize virtualized environments
Administer production Linux systems
Troubleshoot OS-level issues
Write solid shell scripts
Implement and maintain backup strategies
Configure and maintain monitoring and alerting systems
Manage and troubleshoot IP addressing, routing, VLANs, DNS, DHCP, firewalls, and VPNs
Support secure network segmentation and access controls
Configure and maintain VoIP platforms
Install, configure, back up, and maintain MySQL, MariaDB, and/or MongoDB systems
Technical Skills Required
Linux administration, virtualization (VMware vSphere/ESXi, Hyper-V, Proxmox), containerized services, AI inference systems, high-performance compute, MySQL, MariaDB, MongoDB, Nagios, Icinga, Asterisk, FreePBX, Synology Active Backup, ESXi CBT-based backups, snapshots, replication, offsite storage, private LLM access, model serving, AI agents, enterprise knowledge retrieval
Benefits & Perks
Primarily remote work
Salary range not explicitly stated

Job Description


Build Reliable Infrastructure for Healthcare at Scale

If you’re the kind of engineer who takes uptime personally, treats security as non‑negotiable, and likes systems that actually matter, this role is for you. You’ll work on infrastructure that directly supports healthcare delivery through a high‑scale SaaS platform - systems clinicians and care teams rely on every minute of the day.

This is real production infrastructure, real responsibility, and real impact.

The Role

This position focuses on owning and strengthening the infrastructure behind a mission‑critical healthcare SaaS platform. You’ll be responsible for reliability, security, and performance across systems that process sensitive healthcare data and support always‑on clinical workflows.

You’ll operate in a regulated environment where downtime isn’t an option, data integrity is sacred, and disaster recovery must actually work — not just look good in a diagram. You’ll also support emerging AI infrastructure, including systems used for private LLM access, model serving, AI agents, and enterprise knowledge retrieval, all within healthcare‑grade security and compliance boundaries.

This role is hands‑on, Linux‑heavy, and built for engineers who like precise operations, strong controls, and clean execution.


Onsite requirements are very light. For example, you might be onsite for three days in a row and then remote for a month straight. The schedule depends on project workload, but the role is primarily remote.


Candidates must live in Chicagoland or NW Indiana. Visa sponsorship is not available for this role, now or in the future.


Why This Work Is Worth Doing

  • Healthcare Impact at SaaS Scale: Your work directly enables care coordination, clinical workflows, and real‑time communication across healthcare organizations.
  • High‑Availability by Design: Redundancy, fault tolerance, and DR aren’t wishlist items; they’re baked into how the platform runs.
  • True Ownership: You own production systems end‑to‑end: provisioning, monitoring, incident response, and continuous improvement.
  • Security Comes First: Compliance, auditability, and data protection are core operating principles, not box‑checking exercises.


What You’ll Own


Virtualization & Infrastructure

  • Provision, manage, and optimize virtualized environments using platforms like VMware vSphere/ESXi, Hyper‑V, Proxmox, or equivalents.
  • Support high‑availability workloads, full VM lifecycle management, snapshotting, cloning, and performance troubleshooting.
  • Provision infrastructure for modern workloads, including containerized services, AI inference systems, and high‑performance compute where required.
  • Participate in capacity planning and scaling efforts to keep pace with SaaS growth.

Linux Systems Administration

  • Administer production Linux systems (Debian, Ubuntu, CentOS/RHEL, or similar).
  • Operate primarily via CLI with a focus on stability, security hardening, and disciplined patch management.
  • Troubleshoot OS‑level issues impacting performance, reliability, and availability.

Command Line & Automation

  • Write solid shell scripts (bash/zsh) to automate operational tasks and eliminate manual risk.
  • Investigate system behavior using tools like systemctl, journalctl, top/htop, and tcpdump.
  • Automate provisioning, deployment, and lifecycle management for AI services and model endpoints using repeatable, auditable workflows.
  • Improve consistency through automation and clear operational documentation.

Backup, Disaster Recovery & Business Continuity

  • Implement and maintain backup strategies using tooling such as Synology Active Backup, ESXi CBT‑based backups, snapshots, replication, and offsite storage.
  • Regularly validate backups and perform test restores — because untested backups don’t count.
  • Support defined RPO/RTO targets and actively participate in DR testing and reviews.

Monitoring, Alerting & Incident Response

  • Configure and maintain monitoring and alerting systems such as Nagios, Icinga, or comparable platforms.
  • Build meaningful checks, alerts, and dashboards that surface real problems — not noise.
  • Participate in incident response, root cause analysis, and post‑incident improvement cycles.

Networking & Security Fundamentals

  • Manage and troubleshoot IP addressing, routing, VLANs, DNS, DHCP, firewalls, and VPNs.
  • Support secure network segmentation and access controls appropriate for healthcare SaaS environments.
  • Design and maintain secure connectivity for AI services, including private model APIs, data stores, agent tools, and knowledge systems.
  • Support vulnerability remediation, security reviews, and audit readiness efforts.

VoIP & Clinical Communication Systems

  • Configure and maintain VoIP platforms using Asterisk and/or FreePBX.
  • Troubleshoot SIP, call routing, and reliability issues affecting clinical and operational users.

Database Systems Support

  • Install, configure, back up, and maintain MySQL, MariaDB, and/or MongoDB systems.
  • Manage users and permissions, monitor performance, and assist with query optimization.
  • Ensure database recoverability and integrity consistent with healthcare data requirements.


What You Bring

  • Proven experience in systems or network administration within SaaS, healthcare, or other regulated environments.
  • Strong Linux administration skills and deep comfort working in production via the command line.
  • Solid understanding of virtualization, networking, monitoring, and backup/DR best practices.
  • Experience supporting systems that demand high availability, auditability, and data protection.
  • Clear documentation habits and the ability to work effectively across engineering, security, and operations.
  • Working knowledge of AI infrastructure concepts, including model serving, LLM‑based services, vector databases, embeddings, and retrieval workflows.
  • Familiarity with AI agent architectures and service integration patterns such as MCP or similar model‑to‑tool connectivity approaches.
  • Experience operating API‑driven services with strong controls around authentication, secrets management, logging, and sensitive data access.


Bonus Points

  • Hands‑on experience in HIPAA‑regulated or compliance‑driven environments.
  • Familiarity with healthcare platforms like EHR, ePCR, telehealth, or care coordination systems.
  • Experience with automation, configuration management, or infrastructure‑as‑code practices.
  • Exposure to GPU servers, containerized AI workloads, or self‑hosted model‑serving platforms.
  • Familiarity with private LLM deployments, vector search, RAG architectures, or enterprise AI platforms.
  • Understanding of AI governance, prompt and data security, and operational controls in regulated environments.

