We are seeking an experienced AI Engineer with strong Go and Python expertise to design, build, and maintain production-grade LLM-powered systems. The ideal candidate will have a strong focus on reliability, scalability, and performance.
Job Description
Position: AI Engineer (Go & Python)
Location: Remote
Company: ShelterZoom
Department: Product & Engineering
About ShelterZoom
ShelterZoom is a deep tech company and multi-category innovator in cybersecurity, digital content ownership, and business continuity. With nearly 80 patents and trademarks in its portfolio, ShelterZoom is redefining data ownership, tracking, and protection.
Recognized by Gartner® as a market leader for three consecutive years (2022–2024), ShelterZoom continues to shape the future of digital resilience. From inventing document tokens and Single Source of Truth® (SSOT) technology to building Spare Tire—a life-saving solution for healthcare EHR downtime—and developing safeguards against AI manipulation, ShelterZoom is creating platforms that define the next generation of data ownership, cybersecurity, and artificial intelligence.
Learn more: www.shelterzoom.com
About the Role
We are looking for an AI Engineer with strong Go and Python expertise to build, deploy, and operate production-grade LLM-powered systems.
At ShelterZoom, we bring together high-performing individuals who think creatively, move fast, and push boundaries. If you are ready to unlock new opportunities, create real-world impact, and help shape the future of digital resilience, we’d love to hear from you.
Key Responsibilities:
- Design, build, and maintain production-ready LLM and RAG-based systems with a strong focus on reliability, scalability, and performance
- Architect and implement Retrieval-Augmented Generation (RAG) pipelines, including hybrid search (dense + sparse), re-ranking, and retrieval validation
- Design and manage LLM memory layers, including:
  - Short-term memory (context windows)
  - Long-term memory (vector databases)
  - Structured memory (key-value stores and relational databases)
- Integrate and optimize LLMs and embeddings for real-world use cases, including chunking, indexing, and retrieval strategies
- Develop and orchestrate AI agents, including multi-agent systems (planner–executor, supervisor–worker patterns)
- Implement robust guardrails and hallucination mitigation techniques using validation, constraints, and monitoring
- Build observability and evaluation pipelines for LLM systems (quality, latency, cost, and safety)
- Optimize systems for latency, cost efficiency, and throughput, including async inference, streaming, caching, and prompt compression
- Collaborate closely with product, security, and infrastructure teams to deliver secure, enterprise-grade AI solutions
- Contribute to system design, code reviews, testing, and continuous improvement of engineering best practices
Requirements:
RAG & LLM Engineering
- Design and implementation of RAG systems, including:
  - Hybrid search (dense + sparse)
  - Re-ranking strategies
- LLM memory design:
  - Short-term (context window)
  - Long-term (vector databases)
  - Structured memory (KV / relational databases)
- Embeddings & chunking strategies:
  - text-embedding-3, BGE, E5
  - Sliding window and semantic chunking
- Evaluation frameworks:
  - RAGAS, TruLens, DeepEval
- Experience with agent frameworks:
  - LangChain, LangGraph
  - MCP (Model Context Protocol)
  - Multi-agent orchestration patterns
- Prompt engineering:
  - System prompts
  - Few-shot learning
  - Structured outputs
- Guardrails & hallucination mitigation:
  - Retrieval validation
  - Output constraints
- Observability & monitoring:
  - Langfuse, OpenTelemetry, custom tracing
- Cost optimization:
  - Token budgeting
  - Caching strategies
  - Prompt compression
- Latency optimization:
  - Async inference
  - Parallel calls
  - Streaming responses
- Strong proficiency in Python and Go
- Go runtime internals:
- Concurrency model (goroutines, channels)
- Garbage collection
- Core data structures
- Profiling and performance tuning
- Testing practices
Nice to Have:
- Strong understanding of software design principles:
  - SOLID, Clean Architecture, GoF patterns
  - Idiomatic Go project layout
- Networking fundamentals:
  - TCP/IP and core Internet concepts
- Knowledge Graphs / GraphRAG
- Data systems fundamentals:
  - OLTP vs OLAP
  - ACID, CAP theorem
  - Index structures
- Databases:
  - PostgreSQL, MySQL, MongoDB (internal principles and components)
- Infrastructure & DevOps:
  - Kubernetes, Prometheus, CI/CD, AWS
- Distributed systems:
  - Redis, Elasticsearch, Kafka
  - Microservices and high-availability systems
- Exposure to additional technologies beyond the core stack
- Russian language proficiency
Job Details
- Remote Policy: Full Remote
- Contract Type: Full-Time, Contract
Benefits
- Competitive salary
- Learning and development opportunities to support individual growth
- Flexible, remote work with a strong focus on work–life balance and wellbeing
- Open-minded, collaborative, and diverse culture
- Clear career growth path in a company with a one-of-a-kind, category-defining product
- Equal opportunities in a respectful, fair, and socially conscious environment
- A people-first culture built on respect and open communication
- Dynamic and flexible international team
- Opportunity to work on cutting-edge AI-driven cybersecurity and digital resilience platforms with proven market recognition