ClearScale is seeking an experienced AWS Data & GenAI Engineer to design and implement data-focused and AI-powered solutions. Responsibilities include building data pipelines, architecting cloud solutions, and developing GenAI applications using LLMs. Key requirements include proven AWS data toolset experience, strong Python/PySpark skills, and hands-on GenAI/LLM application development.
Join Our Innovative Team at ClearScale
ClearScale, an AWS Premier Consulting Partner, is at the forefront of cloud-driven innovation. We empower enterprises, mid-sized businesses, and startups across diverse industries like Healthcare, Finance, and Technology to achieve ambitious cloud initiatives. Our expertise spans cloud consulting, architecture, migration, automation, application development, and managed services. Due to rapid growth and high demand, we're seeking a talented and experienced AWS Data & GenAI Engineer to play a key role in delivering data-focused and AI-powered solutions for our clients. If you're passionate about building robust, scalable data solutions and cutting-edge generative AI applications on AWS, we want to hear from you.
What You'll Do
- Design and implement efficient architectures for high-load, enterprise-scale applications and 'big data' pipelines on AWS.
- Lead data migration from various sources into scalable Data Lakes on AWS.
- Orchestrate and build robust ETL/ELT processes to transform and load data into target data marts.
- Implement and manage secure data access controls leveraging AWS Lake Formation.
- Architect and develop real-time data ingestion pipelines to process high-volume streams, detect anomalies, and enable windowed analytics, delivering insights to systems like Elasticsearch.
- Analyze project requirements, define scope, estimate effort, and identify the optimal technology stack and tools.
- Design and implement optimal data architectures and migration strategies on the AWS platform.
- Develop new solution modules, re-architect existing components, and refactor program code for improved performance and scalability.
- Define infrastructure requirements and collaborate with DevOps engineers on provisioning.
- Monitor and analyze data pipeline performance, recommending and implementing necessary infrastructure adjustments.
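The windowed-analytics and anomaly-detection responsibilities above can be sketched in miniature. In production this logic would typically live inside Kinesis Data Analytics or Spark Structured Streaming; the tumbling-window size, z-score threshold, and sample stream below are illustrative assumptions, not a prescribed design:

```python
from statistics import mean, pstdev

def tumbling_windows(events, window_size):
    """Group an ordered event stream into fixed-size (tumbling) windows."""
    for i in range(0, len(events), window_size):
        yield events[i:i + window_size]

def detect_anomalies(events, window_size=10, z_threshold=2.0):
    """Flag values whose z-score within their window exceeds the threshold."""
    anomalies = []
    for window in tumbling_windows(events, window_size):
        if len(window) < 2:
            continue
        mu, sigma = mean(window), pstdev(window)
        if sigma == 0:
            continue  # constant window: nothing to flag
        anomalies.extend(v for v in window if abs(v - mu) / sigma > z_threshold)
    return anomalies

# A steady stream with one spike
stream = [10, 11, 9, 10, 200, 10, 11, 10, 9, 10]
print(detect_anomalies(stream))  # prints [200]
```

A real pipeline would emit flagged records downstream (e.g. to Elasticsearch) rather than collect them in a list, but the per-window statistics are the same idea.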
GenAI & LLM Responsibilities:
- Design, develop, and deploy GenAI-powered prototypes and production applications leveraging Large Language Models (LLMs) such as Claude, GPT, LLaMA, and Gemini.
- Build and optimize RAG (Retrieval-Augmented Generation) pipelines integrating enterprise data sources with LLM capabilities.
- Evaluate, fine-tune, and benchmark LLM models for specific business use cases, balancing cost, latency, and quality.
- Integrate GenAI capabilities into existing data pipelines and client-facing applications using APIs, prompt engineering, and orchestration frameworks (e.g., LangChain, LlamaIndex).
- Stay current with rapidly evolving GenAI tools and actively apply them to accelerate daily engineering workflows.
- Effectively communicate project-related updates and challenges with clients.
- Collaborate closely with internal and external development and analytical teams to deliver high-quality data and AI solutions.
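As a toy illustration of the RAG pattern named above: retrieve the most relevant documents by vector similarity, then assemble them into a prompt for the LLM. The bag-of-words "embedding" is a deliberately crude stand-in for a real embedding model (e.g. one served via Amazon Bedrock), and the in-memory list stands in for a vector database; both are assumptions made for the sketch:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a real pipeline calls an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, top_k=2):
    """Rank documents by similarity to the query embedding (the 'R' in RAG)."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]

def build_prompt(query, documents):
    """Assemble retrieved context plus the user question into an LLM prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "AWS Glue runs serverless Spark ETL jobs.",
    "Amazon Kinesis ingests real-time data streams.",
    "Amazon Redshift is a cloud data warehouse.",
]
print(build_prompt("What service runs ETL jobs?", docs))
```

Orchestration frameworks such as LangChain or LlamaIndex package exactly these steps (embed, retrieve, prompt-assemble, generate) behind higher-level abstractions.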
What You'll Bring
Data Engineering Requirements:
- Proven hands-on experience designing efficient architectures for high-load enterprise-scale applications or 'big data' pipelines.
- Deep practical experience utilizing AWS data toolsets, including but not limited to DMS, Glue, DataBrew, EMR, SCT, and AWS Transform.
- Demonstrated experience in implementing end-to-end big data architectures and pipelines on AWS.
- Hands-on experience with message queuing, stream processing technologies, and highly scalable 'big data' stores.
- Advanced knowledge and practical experience working with both SQL and NoSQL databases.
- Proven track record in re-designing and re-architecting large, complex business applications with a focus on data.
- Strong self-management and self-organizational skills, with the ability to drive tasks to completion independently.
GenAI & LLM Requirements:
- Proven experience using GenAI tools (e.g., GitHub Copilot, Claude, ChatGPT) as part of daily engineering workflow to accelerate development, debugging, and documentation.
- Hands-on experience building end-to-end GenAI application prototypes — from data preparation through model integration to user-facing interface.
- Working knowledge of the LLM ecosystem: prompt engineering, embeddings, vector databases (Pinecone, Weaviate, OpenSearch), and orchestration frameworks.
- Familiarity with AWS AI/ML services such as Amazon Bedrock, SageMaker, and Kendra.
Experience with one or more of the following:
- Strong proficiency in Python and PySpark, particularly for developing AWS Glue jobs.
- Expertise with big data tools such as Kafka, Spark, and Hadoop (HDFS, YARN, Tez, Hive, HBase).
- Experience with stream-processing systems like Kinesis Streaming, Spark Streaming, Kafka Streams, and Kinesis Analytics.
- Solid understanding and practical experience with AWS cloud services, including EMR, RDS, MSK, Redshift, DocumentDB, Lambda, ECS, Amazon Bedrock, Athena, and Aurora.
- Familiarity with message queue systems such as ActiveMQ, RabbitMQ, and AWS SQS.
- Experience with federated identity services (SSO) like Okta and AWS Cognito.
- Experience with LLM APIs (OpenAI, Anthropic, Google Vertex AI, Meta LLaMA) and model serving/deployment patterns.
- Practical experience with vector databases and semantic search architectures.
- Knowledge of responsible AI practices: guardrails, content filtering, and bias mitigation in LLM applications.
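Responsible-AI guardrails of the kind listed above often begin with a simple pre-filter applied before any model call; managed options such as Amazon Bedrock Guardrails handle this at scale, but the denylist categories and patterns below are purely illustrative assumptions:

```python
import re

# Illustrative patterns only; real guardrails combine classifiers, PII
# detectors, and managed services rather than a hand-written list.
BLOCKED_PATTERNS = {
    "pii": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-SSN-like pattern
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
}

def check_input(text):
    """Return (allowed, reasons) for a user prompt before it reaches the LLM."""
    reasons = [name for name, pat in BLOCKED_PATTERNS.items() if pat.search(text)]
    return (not reasons, reasons)

print(check_input("Summarize this report for me."))           # prints (True, [])
print(check_input("My SSN is 123-45-6789, please store it."))  # prints (False, ['pii'])
```

The same shape of check is typically applied to model output as well, so that content filtering runs on both sides of the LLM call.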
Ideally, You Also Have
- 5+ years of progressive experience in a Data, Cloud, or Software Engineer role, coupled with a degree in Computer Science, Statistics, Informatics, Information Systems, Mathematics, or a related quantitative field.
- Valid AWS certifications, such as AWS Certified Data Engineer - Associate, AWS Certified Machine Learning - Specialty, or an AWS generative AI certification.
Why You'll Love Building with Us
- Be an integral part of a dynamic and agile team that fosters continuous learning and professional growth.
- Work on challenging and impactful data and GenAI projects for some of the world's leading brands.
- Solve complex and interesting data and AI problems, pushing the boundaries of what's possible on AWS.
- Enjoy a high degree of autonomy and ownership in your work.
- Contribute your expertise and passion to inspire and elevate the team.
Our Commitment to Your Success
- Competitive salary.
- Exceptional opportunities for leadership development within the rapidly expanding cloud and AI industry.
- A collaborative, high-energy, and fully remote work culture.
- Continuous learning and development opportunities to enhance your skills.
- The flexibility and convenience of a 100% distributed workforce – work from anywhere!