AI Summary
mLabs is seeking a Senior Data Engineer to build the foundational data infrastructure for its AI-first fintech platform. The role involves designing, implementing, and governing the core data platform, ensuring high performance and readiness for ML applications.
Key Highlights
Design, build, consolidate, and govern the foundational data platform
Implement robust, resilient batch and real-time data pipelines
Apply expert-level SQL and tools like dbt to build and maintain dimensional models
Implement and enforce strict data governance policies
Technical Skills Required
Google Cloud Platform data services (BigQuery, Dataflow, Cloud Storage, Datastream, AlloyDB/Cloud SQL)
Expert-level SQL and dbt for transformation and testing
Streaming and CDC tooling (Kafka, Kafka Connect, Debezium or similar)
Dimensional modeling, data governance, data quality monitoring, and observability
Data warehouse performance tuning (partitioning, clustering, query optimization)
Benefits & Perks
Competitive salary plus stock options
Flexible contract type: Full-Time Employment (FTE through an Employer of Record) or B2B Contract
30 days of paid holidays (excluding bank holidays)
Private dental and health insurance (for FTE employees)
100% remote work option available
Job Description
Senior Data Engineer - AI-First Fintech Platform
Location: Fully Remote (Across Europe)
Compensation: $120K - $140K
We are a rapidly growing fintech and crypto platform undergoing an AI-first transformation. We are seeking a strategic, high-impact, and high-ownership Senior Data Engineer to build the foundational data infrastructure that powers our cutting-edge AI features.
You will report directly to the CTO and work closely with our Machine Learning (ML) team to consolidate and govern our data ecosystem (including PostgreSQL, Kafka, BigQuery, and GCS) into a clean, governed, and ML-ready platform. Your contributions will directly enable critical AI functionality across our product suite.
This role is responsible for the end-to-end design, implementation, and governance of our core data platform, ensuring high performance and readiness for ML applications.
- Data Platform Architecture: Build and consolidate the foundational data platform, ensuring data from various sources (PostgreSQL, Kafka) is accurately captured and processed into our cloud data warehouse
- Pipeline Development: Design and implement robust, resilient batch and real-time data pipelines using streaming platforms and Change Data Capture (CDC) tools
- Data Modeling & Transformation: Apply expert-level SQL and tools like dbt to build and maintain dimensional models (fact/dimension tables) optimized for analytics, reporting, and Machine Learning feature creation (see the sketch after this list)
- Data Governance & Quality: Implement and enforce strict data governance policies, including PII tagging, column-level security, and access controls, and set up automated data quality monitoring with checks and alerting
- Performance & Optimization: Optimize the performance of our data warehouse (e.g., BigQuery) through techniques like partitioning, clustering, and advanced query optimization
- Observability: Implement and maintain full observability of the data platform, focusing on data freshness monitoring, schema change detection, and pipeline health dashboards
- AI/ML Collaboration: Work closely with the ML team and CTO to structure data specifically to enable and accelerate the development of new AI-driven features
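For illustration only, here is a minimal BigQuery sketch of the kind of partitioned, clustered fact table the modeling and performance bullets above describe; the dataset, table, and column names are hypothetical, not part of our actual schema.

```sql
-- Hypothetical fact table (names are assumptions, not a real schema).
-- Shows a dimensional fact table plus the partitioning and clustering
-- techniques used to keep BigQuery scans cheap and fast.
CREATE TABLE analytics.fct_transactions (
  transaction_id  STRING    NOT NULL,  -- natural key from the source system
  user_key        INT64     NOT NULL,  -- surrogate key into dim_users
  asset_key       INT64     NOT NULL,  -- surrogate key into dim_assets
  amount_usd      NUMERIC,
  transacted_at   TIMESTAMP NOT NULL
)
PARTITION BY DATE(transacted_at)   -- prune scans to the relevant days
CLUSTER BY user_key, asset_key;    -- co-locate rows for common filter columns
```

In a dbt project, a table like this would typically be expressed as a model with the partitioning and clustering set in the model config rather than hand-written DDL.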
Required Skills & Experience
- Cloud Expertise: Strong hands-on experience with Google Cloud Platform (GCP) data services (e.g., BigQuery, Dataflow, Cloud Storage, Datastream, AlloyDB/Cloud SQL)
- SQL & Modeling: Expert-level SQL proficiency and significant experience with data modeling tools like dbt for transformation and testing
- Streaming & CDC: Hands-on experience with streaming platforms (Kafka, Kafka Connect) and an understanding of Change Data Capture (CDC) tools (Debezium or similar)
- Dimensional Modeling: Proven experience building dimensional models (fact/dimension tables) for analytics and ML features
- Data Governance: Practical experience implementing data governance measures (PII tagging, security, access controls)
- Data Quality & Observability: Experience implementing automated data quality monitoring and setting up data observability (freshness, schema changes, health dashboards); a minimal freshness-check sketch follows this list
- Performance Tuning: Experience optimizing data warehouse performance (e.g., BigQuery partitioning, clustering, query tuning)
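As a small, hedged example of what the data quality and observability expectations can look like in practice, here is a minimal BigQuery freshness check; the raw table and `_loaded_at` column are hypothetical stand-ins.

```sql
-- Minimal freshness check (table and column names are hypothetical).
-- Returns a row only when the newest record in the raw table is older than
-- a 2-hour SLA, so a scheduled query or alerting tool can page on any result.
SELECT
  TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), MAX(_loaded_at), MINUTE) AS minutes_stale
FROM raw.kafka_transactions
HAVING minutes_stale > 120;
```

In practice a check like this usually runs as a dbt source freshness test or a scheduled query feeding the pipeline health dashboards mentioned above.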
Nice to Have
- Experience with Feature Store architecture and an understanding of ML feature serving patterns (real-time vs. batch)
- Prior work within financial services or regulated data environments
- Familiarity with the Vertex AI ecosystem or Apache Beam/Dataflow transformations
- Knowledge of vector databases or semantic search concepts
- Background collaborating directly with ML/data science teams
We offer flexibility in contract type and a competitive benefits package designed to support a high-performing global team.
- Compensation: Competitive salary plus stock options (equity package)
- Contract Flexibility: Option for Full-Time Employment (FTE through an Employer of Record, EoR) or a B2B Contract (candidate preference)
- Note: Candidates choosing the B2B contract option do not receive paid holiday or insurance
- Health & Wellness: Private dental and health insurance (for FTE employees)
- Time Off: 30 days of paid holiday, not counting bank holidays (for FTE employees)
- Technology: Provision of the best tech needed for your role
- Work Location: 100% remote work option available, or the choice to work from the office
- Flexibility: Ability to work abroad for up to 4 months a year
- Impact: A leading position offering huge impact, autonomy, and ownership