ProfitSolv seeks a seasoned AI Data Engineer to build a centralized data platform on AWS, unifying data across the portfolio and powering AI-driven experiences. The ideal candidate has 5+ years of hands-on data engineering experience and expertise in AWS services, dbt, and RAG pipelines.
Job Description
ProfitSolv is a SaaS business services provider for the legal and accounting industries. We are looking for an AI Data Engineer to join our growing team!
We are seeking a seasoned AI Data Engineer to help build a centralized data platform on AWS that unifies data across the entire portfolio and powers the next generation of AI-driven experiences for our customers.
This is a greenfield build. We have a clearly defined three-phase architecture (Lakehouse standup → database migration → zero-ETL endgame) and executive sponsorship. We’re looking for an engineer who is equal parts data engineer and AI builder - someone who can write dbt models in the morning and design a RAG pipeline in the afternoon.
What we provide:
- Opportunity to Invest in Your Future. We offer a 401(k) match.
- Paid Time Off. Enjoy paid time off and paid holidays.
- Great Coverage. Take advantage of health, dental, and vision coverage, plus HSA and FSA options.
- A Great Team. Collaborate with smart, curious, hardworking individuals.
- Performance Compensation. Be rewarded for your hard work with performance-based merit increases.
- Remote Work. Want to work from home? No problem!
As an AI Data Engineer, you will:
- Build and maintain a medallion Lakehouse (Bronze/Silver/Gold) on S3 using Apache Iceberg, Glue Data Catalog, and dbt Cloud with the Athena adapter (see the Iceberg table sketch after this list).
- Configure and manage AWS DMS for ongoing CDC from ~1,000 SQL Server instances (see the DMS sketch after this list). Build ECS Fargate tasks for SaaS API ingestion. Orchestrate it all with Amazon MWAA (Airflow).
- Write dbt Cloud models for Bronze → Silver → Gold transforms. Define business metrics in the dbt Semantic Layer so BI tools and AI agents can consume them.
- Manage Redshift Serverless + Spectrum as the read engine for analysts and BI tools. Tune Iceberg table layouts, partitioning, and compaction for performance.
- Implement Lake Formation tag-based governance for multi-product data isolation (see the LF-tag sketch after this list). Onboard new acquisitions to the platform in weeks, not months.
- Build batch embedding pipelines for legal documents and client records (see the embedding sketch after this list). Manage vector storage in OpenSearch Serverless or pgvector on Aurora.
- Design and ship RAG pipelines for legal domain use cases: chunking strategies, retrieval ranking, and context window management.
- Build MCP servers that expose the dbt Semantic Layer and data platform APIs to AI agents (Claude, internal copilots, customer-facing features); see the MCP stub after this list.
- Ensure compliance, security, and governance through IAM roles, encryption policies, and metadata cataloging.
- Other duties as assigned.
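To make the Lakehouse work concrete, here is a minimal sketch of creating a Gold-layer Iceberg table through Athena using boto3. The database, table, columns, and S3 locations are hypothetical placeholders, not our actual schema.

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Hypothetical Gold-layer Iceberg table; all names and buckets are placeholders.
ddl = """
CREATE TABLE IF NOT EXISTS gold.client_invoices (
    invoice_id   string,
    tenant_id    string,
    amount_usd   decimal(12, 2),
    invoiced_at  timestamp
)
PARTITIONED BY (month(invoiced_at))
LOCATION 's3://example-lakehouse/gold/client_invoices/'
TBLPROPERTIES ('table_type' = 'ICEBERG')
"""

athena.start_query_execution(
    QueryString=ddl,
    QueryExecutionContext={"Database": "gold"},
    ResultConfiguration={"OutputLocation": "s3://example-lakehouse/athena-results/"},
)
```

In day-to-day work the transforms themselves live in dbt Cloud models rather than hand-run DDL; this only illustrates the Iceberg-on-Athena layer underneath.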
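On the ingestion side, a hedged sketch of registering one DMS full-load-plus-CDC task for a single source instance with boto3. Every ARN and identifier below is a placeholder; at ~1,000 SQL Server sources, these tasks would realistically be templated in Terraform rather than created by hand.

```python
import json

import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Select every table in the (hypothetical) dbo schema for replication.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-dbo",
            "object-locator": {"schema-name": "dbo", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

response = dms.create_replication_task(
    ReplicationTaskIdentifier="tenant-0001-cdc",  # placeholder task name
    SourceEndpointArn="arn:aws:dms:us-east-1:111111111111:endpoint/src-placeholder",
    TargetEndpointArn="arn:aws:dms:us-east-1:111111111111:endpoint/tgt-placeholder",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111111111111:rep/placeholder",
    MigrationType="full-load-and-cdc",            # initial load, then ongoing CDC
    TableMappings=json.dumps(table_mappings),
)
print(response["ReplicationTask"]["Status"])
```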
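For governance, a sketch of the Lake Formation tag pattern: define a product tag once, then attach it to each acquisition's database so permissions follow the tag instead of per-table grants. The tag key, values, and database name are illustrative only.

```python
import boto3

lf = boto3.client("lakeformation", region_name="us-east-1")

# Hypothetical tag ontology: one "product" LF-tag value per portfolio brand.
lf.create_lf_tag(TagKey="product", TagValues=["product-a", "product-b"])

# Tag a newly onboarded product's Silver database; LF-tag grants elsewhere
# then decide which principals can read data carrying this tag.
lf.add_lf_tags_to_resource(
    Resource={"Database": {"Name": "silver_product_b"}},
    LFTags=[{"TagKey": "product", "TagValues": ["product-b"]}],
)
```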
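On the AI side, a simplified sketch of a batch embedding step, assuming Bedrock's Titan Text Embeddings V2 model. The chunk sizes and document source are placeholders, and production chunking for legal text would likely be section-aware rather than fixed-width.

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Fixed-width character chunks with overlap - the simplest viable strategy."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(chunk: str) -> list[float]:
    # Titan Text Embeddings V2 request/response shape per the Bedrock docs.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": chunk}),
    )
    return json.loads(response["body"].read())["embedding"]


document = "..."  # e.g., a legal document pulled from the Bronze layer
vectors = [(chunk, embed(chunk)) for chunk in chunk_text(document)]
# Each (chunk, vector) pair would then be indexed into OpenSearch Serverless
# (a knn_vector field) or a pgvector column on Aurora PostgreSQL.
```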
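Finally, a stub of an MCP server using the Python MCP SDK's FastMCP helper (the posting lists TypeScript for production MCP work; Python is shown here only to keep these sketches in one language). The server name and tool are hypothetical, and a real tool would query the dbt Semantic Layer rather than return an empty result.

```python
from mcp.server.fastmcp import FastMCP

# Hypothetical server exposing one governed-metric lookup tool to AI agents.
mcp = FastMCP("profitsolv-data-platform")


@mcp.tool()
def query_metric(metric: str, group_by: str) -> dict:
    """Return a governed business metric grouped by the given dimension."""
    # Placeholder: a real implementation would issue a MetricFlow /
    # dbt Semantic Layer query here.
    return {"metric": metric, "group_by": group_by, "rows": []}


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, so agents like Claude can connect
```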
This position follows established policies and procedures to keep confidential information secure.
A great fit for this position has:
- 5+ years of hands-on data engineering experience, with a strong focus on AWS services such as S3, Glue, Athena, and Redshift (or equivalent platforms).
- Proven experience building and maintaining production-grade data models using dbt (Core or Cloud), including testing, macros, and documentation best practices.
- Experience implementing Change Data Capture (CDC) patterns from relational databases using tools such as AWS DMS, Debezium, or similar technologies.
- Demonstrated ability to design, build, and operate production Airflow DAGs in MWAA or self-hosted environments (see the DAG sketch after this list).
- Hands-on experience developing at least one production-ready Retrieval-Augmented Generation (RAG) pipeline, including data chunking, embedding generation, vector storage, and retrieval mechanisms.
- Strong proficiency in SQL (primary language) and Python (for data pipelines and AI workflows), with working knowledge of TypeScript for MCP server development.
- Experience with infrastructure-as-code tools such as Terraform or equivalent solutions.
- Comfortable working across the full technology stack in a high-autonomy environment, with the ability to make architectural decisions and drive solutions independently.
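As a reference point for the Airflow expectation above, here is a minimal TaskFlow-style DAG of the shape this role would own. Both task bodies are stubs, and every name is hypothetical; a production DAG would launch an ECS Fargate task and call the dbt Cloud API.

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@hourly", start_date=datetime(2024, 1, 1), catchup=False)
def bronze_to_silver():
    """Hypothetical MWAA DAG: land SaaS API data, then trigger dbt transforms."""

    @task
    def ingest_api_batch() -> str:
        # Stub: in production this would run an ECS Fargate ingestion task.
        return "s3://example-lakehouse/bronze/api/latest/"

    @task
    def trigger_dbt_cloud(bronze_path: str) -> None:
        # Stub: a real task would hit the dbt Cloud job-trigger API.
        print(f"Would kick off a dbt Cloud run for {bronze_path}")

    trigger_dbt_cloud(ingest_api_batch())


bronze_to_silver()
```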
Additional Desirable Qualifications
- Experience building MCP servers or similar AI tool-use integrations.
- dbt Cloud Semantic Layer / MetricFlow experience.
- Worked with Apache Iceberg, Delta Lake, or Hudi in production.
- SQL Server → Aurora PostgreSQL migration experience.
- Worked in PE-backed, multi-product, or M&A-heavy environments.
- Legal tech, payments, or professional services domain knowledge.
Physical Requirements
- Ability to sit for prolonged periods at a desk and work on a computer.
- Must be able to lift up to 15 pounds at times.
- Ability to handle stress.
- Ability to meet work deadlines.
Our commitment to you: At ProfitSolv, we are committed to being a diverse and inclusive workplace as an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law. We embrace a diverse group of backgrounds and experiences to connect with clients, solve problems, and innovate.
Work location: Remote – U.S. only