Python Developer Intern - Web Scraping and Data Acquisition

tony, llc • United States
Remote

Job Description


Unpaid Internship for Python Developer


Company: Shop ONline New York

Location: Fully Remote

Duration: 3 months

Number of Hours Per Week: 20


About Shop ONline New York:

We are an e-commerce company preparing for launch. To power pricing intelligence, catalog quality, and market insights, we need reliable data pipelines. This internship is for a web scraping and data acquisition specialist who can build robust collectors, normalize results to a schema, and deliver clean datasets to stakeholders.


Role Overview:


You will design, build, and maintain scrapers and API clients that collect structured data from approved sources. Your work includes pagination, authentication, retries, logging, and error handling, followed by normalization into a defined schema and export to Excel/Google Sheets. You will collaborate with analytics, merchandising, and engineering to ensure datasets are accurate, timely, and usable.
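The core loop described above (fetch with retries and backoff, then parse) can be sketched roughly as follows. The URL, CSS selectors, and sample markup are hypothetical, and the sketch assumes the requests and BeautifulSoup libraries named in this posting:

```python
import time

import requests
from bs4 import BeautifulSoup


def fetch(url, session, retries=3, backoff=2.0):
    """GET a page, retrying with exponential backoff on transient failures."""
    for attempt in range(retries):
        try:
            resp = session.get(url, timeout=10)
            resp.raise_for_status()
            return resp.text
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)


def parse_products(html):
    """Extract name/price pairs from a listing page (selectors are illustrative)."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"name": item.select_one(".name").get_text(strip=True),
         "price": item.select_one(".price").get_text(strip=True)}
        for item in soup.select(".product")
    ]


# Parsing works the same on a saved page as on a live fetch:
sample = """
<div class="product"><span class="name">Widget</span><span class="price">$9.99</span></div>
<div class="product"><span class="name">Gadget</span><span class="price">$4.50</span></div>
"""
rows = parse_products(sample)
print(rows)
```

Keeping fetching and parsing in separate functions makes the parser testable against saved HTML without any network access.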


Key Responsibilities:


  • Build scrapers using requests and BeautifulSoup; use Selenium or Playwright only when browser automation is required.
  • Integrate REST APIs; handle pagination, rate limits, and authentication (API keys, OAuth) reliably.
  • Implement robust controls: retries with backoff, structured logging, exception handling, and idempotent runs.
  • Normalize raw data to a defined schema and maintain a clear data dictionary with field types and definitions.
  • Deduplicate records, validate types, and add metadata (source, timestamps, run status) for auditability.
  • Export cleaned datasets to CSV/XLSX and Google Sheets; prepare concise data notes for recipients.
  • Schedule and version jobs (cron/GitHub Actions), maintain requirements.txt/poetry, and document setup and usage.
  • Follow legal, policy, and ethical guidelines: respect robots.txt, terms of service, privacy, and compliance standards.
  • Collaborate with stakeholders to refine requirements and adjust schemas as business needs evolve.
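The normalize, deduplicate, annotate, and export responsibilities above might look roughly like this in pandas. The schema, field names, and source label are illustrative, not part of the actual pipeline:

```python
from datetime import datetime, timezone

import pandas as pd

# Raw records as a scraper might emit them (field names are illustrative).
raw = pd.DataFrame([
    {"name": " Widget ", "price": "$9.99", "sku": "A1"},
    {"name": "Widget",   "price": "$9.99", "sku": "A1"},   # duplicate row
    {"name": "Gadget",   "price": "$4.50", "sku": "B2"},
])


def normalize(df, source):
    """Coerce types, deduplicate on a key, and attach audit metadata."""
    out = df.copy()
    out["name"] = out["name"].str.strip()                      # tidy strings
    out["price"] = out["price"].str.lstrip("$").astype(float)  # enforce types
    out = out.drop_duplicates(subset="sku")                    # dedupe records
    out["source"] = source                                     # auditability:
    out["scraped_at"] = datetime.now(timezone.utc).isoformat() # source + timestamp
    return out


clean = normalize(raw, source="example-source")
clean.to_csv("products.csv", index=False)  # .to_excel(...) covers the XLSX case
print(clean[["sku", "price"]])
```

The metadata columns (`source`, `scraped_at`) are what make a run auditable after the fact, as the responsibilities list calls out.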


Qualifications:


  • Experience building scrapers with requests and BeautifulSoup; ability to use Selenium or Playwright when necessary.
  • Strong pandas skills for cleaning, joining, reshaping, validation, and export to Excel/Google Sheets.
  • Comfortable integrating REST APIs with pagination and authentication flows.
  • Demonstrated ability to design resilient scripts with retries, logging, and error handling.
  • Experience normalizing data into a predefined schema and maintaining a data dictionary.
  • Git proficiency and familiarity with virtual environments and dependency management.
  • Understanding of ethical scraping practices, robots.txt, and rate limiting.
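The robots.txt understanding asked for above can be made concrete with the standard library's urllib.robotparser. The rules below are a made-up example; in practice you would load the live file with `set_url(...)` and `read()`:

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for illustration only.
rules = """
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Check each URL before fetching it, and honor the crawl delay between requests.
print(rp.can_fetch("my-scraper", "https://example.com/products"))   # True
print(rp.can_fetch("my-scraper", "https://example.com/private/x"))  # False
print(rp.crawl_delay("my-scraper"))                                 # 5
```

Gating every fetch on `can_fetch` and sleeping for `crawl_delay` between requests covers the two ethical-scraping basics the qualifications list names.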


Nice to Have:


  • Use of tenacity/logging libraries, Pydantic for validation, or Prefect/Airflow for orchestration.
  • Basic SQL for loading data into a warehouse; experience with Google Sheets API.
  • Proxy management and captcha handling where permitted and appropriate.
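Pydantic validation, one of the nice-to-haves above, catches bad records before they reach the export step. A minimal sketch, assuming a hypothetical `Product` schema:

```python
from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    sku: str
    name: str
    price: float  # strings like "9.99" are coerced; non-numeric values are rejected


def validate_rows(rows):
    """Split raw dicts into validated models and rejected rows."""
    good, bad = [], []
    for row in rows:
        try:
            good.append(Product(**row))
        except ValidationError:
            bad.append(row)
    return good, bad


good, bad = validate_rows([
    {"sku": "A1", "name": "Widget", "price": "9.99"},
    {"sku": "B2", "name": "Gadget", "price": "not a number"},
])
print(len(good), len(bad))
```

Collecting rejected rows instead of crashing lets a scheduled run finish and report which records need attention.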


Exceptional Internship Benefits:


  • Ship real data pipelines used by leadership for decisions.
  • Mentorship and code reviews from experienced engineers.
  • A portfolio of production-grade scrapers, API clients, and normalized datasets.
  • Certificate of completion and, based on performance, a detailed letter of recommendation.


