Key Highlights
Design, implement, and maintain scalable data architectures and pipelines. Lead technical efforts, contribute to architecture decisions, and mentor junior team members. Collaborate with cross-functional teams to develop tailored data solutions.
Job Description
About The Company
Fusemachines is a leading provider of AI strategy, talent, and education services dedicated to democratizing artificial intelligence globally. Established by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, the company aims to empower organizations and individuals by making AI accessible and impactful. With a presence in four countries—Nepal, the United States, Canada, and the Dominican Republic—and a diverse team of over 450 employees, Fusemachines leverages its global expertise to transform businesses through innovative AI solutions. The company's mission centers on fostering AI literacy, developing cutting-edge AI talent, and delivering strategic AI consulting to a broad range of industries worldwide.
About The Role
This is a full-time remote position designed for a talented Senior Data Engineer or Technical Lead. The role involves designing, building, testing, optimizing, and maintaining the infrastructure and code necessary for comprehensive data integration, storage, processing, and analytics. The focus is on creating robust data pipelines and analytics solutions, from data ingestion to consumption, ensuring high data quality, accessibility, and security to support business intelligence and advanced analytics initiatives. The ideal candidate will possess a strong programming background and a deep understanding of managing data across various storage systems and cloud platforms. They will lead technical efforts, contribute to architecture decisions, and mentor junior team members, all within an Agile environment. The role also involves contributing to cloud migration projects, particularly transitioning data solutions from AWS to GCP, leveraging expertise in both cloud platforms to optimize performance, cost, and scalability.
Qualifications
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field.
- 5+ years of professional experience in data engineering, with proven expertise in AWS and GCP cloud environments.
- Relevant certifications in cloud platforms (preferred but not mandatory).
- Proficiency in Python, SQL, and PySpark for large-scale data processing.
- Experience designing and optimizing data pipelines, data lakes, and data warehouses using tools like Spark, DBT, Kafka, and Airflow.
- Strong programming skills in Python, Scala, and SQL, with knowledge of data modeling and database design.
- Hands-on experience with relational and NoSQL databases such as MySQL, Postgres, Cassandra, and MongoDB.
- Deep understanding of AWS services including Lambda, Kinesis, Redshift, S3, EMR, EC2, IAM, and CloudWatch.
- Experience with orchestration tools like Airflow or Composer, and with DevOps practices including GitHub, CI/CD pipelines, and Terraform.
- Excellent problem-solving, leadership, and project management capabilities.
- Strong communication skills to collaborate effectively across technical and non-technical teams.
Key Responsibilities
- Design, implement, deploy, test, and maintain scalable and efficient data architectures and pipelines.
- Establish and uphold standards and best practices for data management and integration.
- Ensure the scalability, reliability, performance, and quality of data systems.
- Mentor and guide junior and mid-level data engineers to foster team growth and skill development.
- Collaborate with Product, Engineering, Data Science, and Analytics teams to understand data requirements and develop tailored data solutions.
- Evaluate, select, and implement new technologies and tools to enhance data processing capabilities.
- Develop architecture, observability, and testing strategies to ensure data pipeline reliability.
- Manage storage layers, including schema design, indexing, and performance tuning for optimal data access.
- Address complex data engineering issues swiftly, resolving bottlenecks in SQL queries and database operations.
- Conduct discovery and assessment of existing data infrastructure and propose scalable architecture improvements.
- Implement data governance frameworks, including data cataloging, lineage, quality, and compliance practices.
- Define and document data engineering workflows, processes, and data flows for transparency and consistency.
- Participate actively in Agile ceremonies, contributing to continuous improvement initiatives.
Benefits & Perks
- Competitive salary package aligned with industry standards.
- Opportunity to work remotely from anywhere, providing flexibility and work-life balance.
- Professional development opportunities, including certifications and training.
- Collaborative and innovative work environment with a diverse team.
- Participation in cutting-edge AI and data projects with global impact.
- Health and wellness benefits (if applicable, based on location).
- Paid time off and holidays to support personal well-being.
Fusemachines is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, veteran status, or any other protected class under applicable laws.