AWS Data & GenAI Engineer

ClearScale • EMEA
Remote
Job Description


Join Our Innovative Team at ClearScale

ClearScale, an AWS Premier Consulting Partner, is at the forefront of cloud-driven innovation. We empower enterprises, mid-sized businesses, and startups across diverse industries like Healthcare, Finance, and Technology to achieve ambitious cloud initiatives. Our expertise spans cloud consulting, architecture, migration, automation, application development, and managed services. Due to rapid growth and high demand, we're seeking a talented and experienced AWS Data & GenAI Engineer to play a key role in delivering data-focused and AI-powered solutions for our clients. If you're passionate about building robust, scalable data solutions and cutting-edge generative AI applications on AWS, we want to hear from you.


What You'll Do

  • Design and implement efficient architectures for high-load, enterprise-scale applications and 'big data' pipelines on AWS.
  • Lead data migration from various sources into scalable Data Lakes on AWS.
  • Orchestrate and build robust ETL/ELT processes to transform and load data into target data marts.
  • Implement and manage secure data access controls leveraging AWS Lake Formation.
  • Architect and develop real-time data ingestion pipelines to process high-volume streams, detect anomalies, and enable windowed analytics, delivering insights to systems like Elasticsearch.
  • Analyze project requirements, define scope, estimate effort, and identify the optimal technology stack and tools.
  • Design and implement optimal data architectures and migration strategies on the AWS platform.
  • Develop new solution modules, re-architect existing components, and refactor program code for improved performance and scalability.
  • Define infrastructure requirements and collaborate with DevOps engineers on provisioning.
  • Monitor and analyze data pipeline performance, recommending and implementing necessary infrastructure adjustments.
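
For illustration only, the windowed-analytics and anomaly-detection responsibility above can be sketched in plain Python. A production pipeline would consume events from Kinesis Data Streams or Kafka rather than an in-memory list, and the window size and z-score threshold here are hypothetical choices for demonstration.

```python
from collections import deque
from statistics import mean, stdev

def windowed_anomalies(events, window_size=5, z_threshold=2.0):
    """Flag events whose value deviates strongly from the trailing window.

    `events` is an iterable of (timestamp, value) pairs; a real pipeline
    would read these from a stream and write flagged events downstream
    (e.g. to Elasticsearch) instead of returning a list.
    """
    window = deque(maxlen=window_size)
    anomalies = []
    for ts, value in events:
        if len(window) == window.maxlen:
            mu, sigma = mean(window), stdev(window)
            # z-score against the trailing window of earlier values
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                anomalies.append((ts, value))
        window.append(value)
    return anomalies

# A steady signal with one spike at t=10.
events = [(t, 10.0 + (t % 2)) for t in range(10)] + [(10, 99.0)]
print(windowed_anomalies(events))  # → [(10, 99.0)]
```

The trailing-window approach keeps state bounded, which is the same property a Spark Streaming or Kinesis Data Analytics window provides at scale.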


GenAI & LLM Responsibilities:

  • Design, develop, and deploy GenAI-powered prototypes and production applications leveraging Large Language Models (LLMs) such as Claude, GPT, LLaMA, and Gemini.
  • Build and optimize RAG (Retrieval-Augmented Generation) pipelines integrating enterprise data sources with LLM capabilities.
  • Evaluate, fine-tune, and benchmark LLMs for specific business use cases, balancing cost, latency, and quality.
  • Integrate GenAI capabilities into existing data pipelines and client-facing applications using APIs, prompt engineering, and orchestration frameworks (e.g., LangChain, LlamaIndex).
  • Stay current with rapidly evolving GenAI tools and actively apply them to accelerate daily engineering workflows.
  • Effectively communicate project-related updates and challenges with clients.
  • Collaborate closely with internal and external development and analytical teams to deliver high-quality data and AI solutions.
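
As a minimal sketch of the RAG responsibility above: the code below ranks documents with a toy bag-of-words cosine similarity and assembles a grounded prompt. The `embed` function is a stand-in assumption; a real pipeline would call an embedding model (e.g. via Amazon Bedrock) and query a vector store such as OpenSearch or Pinecone.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (a placeholder for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents):
    """Assemble a grounded prompt from the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

docs = [
    "Invoices are stored in the billing data mart.",
    "The VPN requires an Okta login.",
    "Billing disputes are escalated to finance.",
]
print(build_prompt("Where are invoices stored?", docs))
```

The structure (embed, retrieve, assemble, then call the model) is the same regardless of whether the plumbing is hand-rolled or delegated to an orchestration framework like LangChain or LlamaIndex.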


What You'll Bring

Data Engineering Requirements:

  • Proven hands-on experience designing efficient architectures for high-load, enterprise-scale applications or 'big data' pipelines.
  • Deep practical experience utilizing AWS data toolsets, including but not limited to DMS, Glue, DataBrew, EMR, SCT, and AWS Transform.
  • Demonstrated experience in implementing end-to-end big data architectures and pipelines on AWS.
  • Hands-on experience with message queuing, stream processing technologies, and highly scalable 'big data' stores.
  • Advanced knowledge and practical experience working with both SQL and NoSQL databases.
  • Proven track record in re-designing and re-architecting large, complex business applications with a focus on data.
  • Strong self-management and self-organizational skills, with the ability to drive tasks to completion independently.

GenAI & LLM Requirements:

  • Proven experience using GenAI tools (e.g., GitHub Copilot, Claude, ChatGPT) as part of daily engineering workflow to accelerate development, debugging, and documentation.
  • Hands-on experience building end-to-end GenAI application prototypes — from data preparation through model integration to user-facing interface.
  • Working knowledge of the LLM ecosystem: prompt engineering, embeddings, vector databases (Pinecone, Weaviate, OpenSearch), and orchestration frameworks.
  • Familiarity with AWS AI/ML services such as Amazon Bedrock, SageMaker, and Kendra.


Experience with one or more of the following:

  • Strong proficiency in Python and PySpark, particularly for developing AWS Glue jobs.
  • Expertise with big data tools such as Kafka, Spark, and Hadoop (HDFS, YARN, Tez, Hive, HBase).
  • Experience with stream-processing systems like Kinesis Data Streams, Spark Streaming, Kafka Streams, and Kinesis Data Analytics.
  • Solid understanding and practical experience with AWS cloud services, including EMR, RDS, MSK, Redshift, DocumentDB, Lambda, ECS, Amazon Bedrock, Athena, and Aurora.
  • Familiarity with message queue systems such as ActiveMQ, RabbitMQ, and AWS SQS.
  • Experience with federated identity services (SSO) like Okta and AWS Cognito.
  • Experience with LLM APIs (OpenAI, Anthropic, Google Vertex AI, Meta LLaMA) and model serving/deployment patterns.
  • Practical experience with vector databases and semantic search architectures.
  • Knowledge of responsible AI practices: guardrails, content filtering, and bias mitigation in LLM applications.
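
As a minimal sketch of the guardrails and content-filtering item above: the wrapper below screens both the prompt and the model output against a hypothetical regex blocklist before anything reaches the user. Production systems would layer managed guardrails (e.g. Amazon Bedrock Guardrails) and classifier-based filters on top of simple pattern checks like this.

```python
import re

# Hypothetical blocklist for demonstration; real policies would be
# broader and maintained outside the code.
BLOCKED_PATTERNS = [r"\bpassword\b", r"\bssn\b", r"\d{3}-\d{2}-\d{4}"]

def violates_policy(text):
    """Return True if the text matches any restricted pattern."""
    return any(re.search(p, text, re.IGNORECASE) for p in BLOCKED_PATTERNS)

def guarded_completion(prompt, call_llm):
    """Wrap an LLM call with input and output content checks."""
    if violates_policy(prompt):
        return "Request blocked: the prompt contains restricted content."
    answer = call_llm(prompt)
    if violates_policy(answer):
        return "Response withheld: the model output failed a content check."
    return answer

# Stand-in for a real model call.
echo_llm = lambda p: f"You asked: {p}"
print(guarded_completion("What is my SSN?", echo_llm))
# → Request blocked: the prompt contains restricted content.
```

Checking the output as well as the input matters because a clean prompt can still elicit restricted content from the model.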


Ideally, You Also Have

  • 5+ years of progressive experience in a Data, Cloud, or Software Engineer role, coupled with a degree in Computer Science, Statistics, Informatics, Information Systems, Mathematics, or a related quantitative field.
  • Valid AWS certifications in Data Engineer, Machine Learning Specialty, or AWS Generative AI certifications.


Why You'll Love Building with Us

  • Be an integral part of a dynamic and agile team that fosters continuous learning and professional growth.
  • Work on challenging and impactful data and GenAI projects for some of the world's leading brands.
  • Solve complex and interesting data and AI problems, pushing the boundaries of what's possible on AWS.
  • Enjoy a high degree of autonomy and ownership in your work.
  • Contribute your expertise and passion to inspire and elevate the team.


Our Commitment to Your Success

  • Competitive salary.
  • Exceptional opportunities for leadership development within the rapidly expanding cloud and AI industry.
  • A collaborative, high-energy, and fully remote work culture.
  • Continuous learning and development opportunities to enhance your skills.
  • The flexibility and convenience of a 100% distributed workforce – work from anywhere!

