Design, develop, and maintain scalable data pipelines and ELT processes across Azure and Google Cloud Platform, implementing and optimizing data solutions with Python and PySpark in close collaboration with cross-functional teams and P&G stakeholders.
Job Description
Before you apply, please get familiar with Luxoft:
- Luxoft locations: https://career.luxoft.com/locations/
- Logeek Magazine: https://career.luxoft.com/logeek-magazine/
- Luxoft Alumni Club: https://career.luxoft.com/alumni/
Mandatory Skills Description:
- Strong proficiency in Python and PySpark, with hands-on experience in production-grade data solutions.
- Solid knowledge of Data Engineering concepts, including object storage, data pipelines, ELT processes, and data modeling.
- Hands-on experience with cloud-based data platforms on Microsoft Azure and Google Cloud Platform.
- Practical experience with tools and services such as Azure Databricks, Azure Blob Storage, Azure SQL Server, BigQuery, DataProc, Airflow, and GCP Buckets.
- Strong understanding of Software Engineering best practices, including:
  - Continuous Integration and Continuous Deployment (CI/CD)
  - Unit testing, test coverage, and code quality standards (see the testing sketch after this list)
  - Linting and static code analysis
  - Version Control Systems (Git)
- Experience with GitHub and GitHub Actions for source control and CI/CD automation.
- Understanding of the Software Development Lifecycle (SDLC), including requirements definition, code reviews, release engineering, and deployments.
- Ability to work with enterprise-scale data, including sensitive datasets such as Point of Sale (POS) data, while adhering to security and compliance requirements.
- Familiarity with working in enterprise cloud environments, including access via virtual machines and secure endpoints.
- Ability to work effectively in a distributed team environment and collaborate with multiple stakeholders.
- Availability to work in the same time zone as the P&G team or ensure a minimum of 4-hour daily overlap.
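For illustration only, below is a minimal sketch of the kind of unit test implied by the best-practices bullet above, written with pytest against a local SparkSession so it runs without any cloud dependency. The helper dedupe_latest, its column names, and the sample data are hypothetical, not part of the project.

    import pytest
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window


    def dedupe_latest(df, key_col, ts_col):
        # Hypothetical helper: keep only the newest row per key.
        w = Window.partitionBy(key_col).orderBy(F.col(ts_col).desc())
        return (df.withColumn("_rn", F.row_number().over(w))
                  .filter(F.col("_rn") == 1)
                  .drop("_rn"))


    @pytest.fixture(scope="module")
    def spark():
        # Small local session; no cluster required for unit tests.
        session = SparkSession.builder.master("local[2]").appName("unit-tests").getOrCreate()
        yield session
        session.stop()


    def test_dedupe_latest_keeps_newest_row(spark):
        df = spark.createDataFrame(
            [("store-1", "2024-01-01", 10), ("store-1", "2024-01-02", 20)],
            ["store_id", "event_date", "units"],
        )
        result = dedupe_latest(df, "store_id", "event_date").collect()
        assert len(result) == 1
        assert result[0]["units"] == 20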
Project Description:
- The project focuses on the development, maintenance, and enhancement of a cloud-based data platform supporting Procter & Gamble analytics and data processing use cases. The platform operates across Microsoft Azure and Google Cloud Platform, enabling scalable data ingestion, transformation, storage, and analytics for enterprise and Point of Sale (POS) data.
- The engineer will be part of a data engineering team responsible for designing and implementing robust data pipelines and ELT processes, building data models, and ensuring high-quality, reliable data delivery. The role requires strong software engineering practices, including CI/CD, automated testing, code quality standards, and version control.
- On the Azure side, the solution leverages Azure Databricks, Azure SQL Server, Azure Machine Learning, and Azure Blob Storage, primarily using Python and PySpark. On Google Cloud Platform, the stack includes BigQuery, DataProc, Airflow, and GCP Buckets, with the same emphasis on Python and PySpark development (a minimal PySpark sketch follows this description).
- The engineer will actively contribute throughout the software development lifecycle, including requirements definition, code reviews, deployment, release engineering, and operational support. CI/CD pipelines are implemented using GitHub and GitHub Actions, following enterprise security and compliance standards.
- The role requires compliance with P&G IT policies, completion of mandatory P&G training, and authorization to work with POS (Point of Sale) data. The engineer must have access to both Azure and GCP P&G cloud environments, including virtual machines with connectivity to PGI endpoints. A macOS or Linux development environment is preferred.
- From a collaboration perspective, the role works closely with P&G stakeholders and distributed teams. Candidates located in Latin America are preferred, with Spanish language skills considered a plus. The engineer should operate in the same time zone as the P&G team, although a minimum 4-hour overlap may be negotiated for candidates from other regions.
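To make the stack concrete, here is a minimal sketch of a single PySpark ELT step of the kind described above. The storage paths, container names, and POS column names are hypothetical placeholders, not the project's actual layout.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("pos-daily-elt").getOrCreate()

    # Extract: raw POS files landed in object storage (hypothetical path).
    raw = spark.read.parquet(
        "abfss://raw@examplestorage.dfs.core.windows.net/pos/2024-06-01/"
    )

    # Transform: basic cleansing plus a daily aggregate per store.
    daily = (
        raw.filter(F.col("quantity") > 0)
           .withColumn("sale_date", F.to_date("sold_at"))
           .groupBy("store_id", "sale_date")
           .agg(
               F.sum("quantity").alias("units_sold"),
               F.sum("net_amount").alias("net_revenue"),
           )
    )

    # Load: write a partitioned, curated table for downstream consumers.
    (
        daily.write.mode("overwrite")
             .partitionBy("sale_date")
             .parquet("abfss://curated@examplestorage.dfs.core.windows.net/pos_daily/")
    )

On the GCP side, an equivalent job would typically run on DataProc and read from and write to gs:// buckets, with little change beyond the paths.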
Responsibilities:
- Design, develop, and maintain scalable data pipelines and ELT processes across Azure and Google Cloud Platform environments.
- Implement and optimize data solutions using Python and PySpark on platforms such as Azure Databricks, BigQuery, DataProc, and Airflow (an orchestration sketch follows this list).
- Develop and maintain data models to support analytics, reporting, and downstream data consumers.
- Ensure high code quality by applying software engineering best practices, including unit testing, code coverage, linting, and peer code reviews.
- Build, maintain, and enhance CI/CD pipelines using GitHub and GitHub Actions.
- Participate in the full software development lifecycle, including requirements analysis, design, implementation, deployment, and release management.
- Collaborate closely with cross-functional teams and P&G stakeholders to deliver reliable, high-quality data solutions.
- Monitor, troubleshoot, and optimize data workflows to ensure performance, scalability, and reliability.
- Adhere to P&G IT policies, security standards, and compliance requirements, including working with POS (Point of Sale) data.
- Contribute to documentation, knowledge sharing, and continuous improvement of engineering processes and standards.
- Support cloud infrastructure usage on Azure and GCP, including working with virtual machines and secure access to PGI endpoints.
- Actively identify opportunities for automation, process improvements, and reduction of operational defects.
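As a sketch of the orchestration referenced above, the following hypothetical Airflow DAG submits a PySpark job to DataProc once a day. Project, cluster, region, and bucket names are placeholders, and the import assumes the Google provider package for Airflow 2.x.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocSubmitJobOperator,
    )

    # Hypothetical job spec: run a PySpark script stored in a GCS bucket.
    PYSPARK_JOB = {
        "reference": {"project_id": "example-project"},
        "placement": {"cluster_name": "example-cluster"},
        "pyspark_job": {"main_python_file_uri": "gs://example-bucket/jobs/pos_daily_elt.py"},
    }

    with DAG(
        dag_id="pos_daily_elt",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # "schedule_interval" on Airflow versions before 2.4
        catchup=False,
    ) as dag:
        DataprocSubmitJobOperator(
            task_id="run_pos_elt",
            job=PYSPARK_JOB,
            region="us-central1",
            project_id="example-project",
        )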