AI Summary
Transform biomedical imaging data from clinical trials and real-world data (RWD). Design, implement, and maintain automated pipelines for onboarding, verifying, transforming, and curating biomedical imaging data. Ensure data quality, integrity, and compliance.
Key Highlights
Imaging Data Pipeline Delivery
Data Quality And Integrity
Data Analysis And Integration
Image Data Management
Compliance And Controls
Collaboration
External Collaboration
Lead The Delivery Team
Technical Skills Required
Benefits & Perks
4 days per week in office
Relocation assistance (candidates must reside in San Francisco, CA, or be open to relocation)
Authorized to work in the United States
Job Description
Location (Onsite from Day 1):
San Francisco, CA (4 days per week in office). Candidates must reside in San Francisco, CA, or be open to relocation.
Candidates should be authorized to work in the United States.
What's In It For You
The Image Curation and Data Products team transforms biomedical imaging data from clinical trials and RWD by applying tools and workflows to deliver high-quality, FAIR imaging datasets. These enable imaging data scientists to discover and use data for purposes ranging from exploratory analysis to algorithm development.
Key Responsibilities
Imaging Data Pipeline Delivery
Design, implement, and maintain automated pipelines for onboarding, verifying, transforming, and curating biomedical imaging data from clinical trials and real-world data sources across therapeutic areas (Oncology, Neurology, Ophthalmology), covering all image file formats.
Data Quality And Integrity
Develop and implement solutions to detect and correct anomalies and inconsistencies, achieving the highest imaging data quality per industry standards (DICOM) and internal specifications such as FFS, RTS, GDSR, etc. Ensure de-identification, PHI/PII controls, and image-specific QC checks are implemented at scale.
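The de-identification and QC responsibilities above can be sketched in miniature. This stdlib-only example models a DICOM header as a plain dict of tag keywords; the keyword lists are an illustrative subset only (a real pipeline would operate on actual DICOM headers, e.g. via pydicom, and follow the full DICOM PS3.15 confidentiality profiles).

```python
# Minimal sketch: tag-level PHI redaction plus a required-tag QC check.
# Header is modeled as a dict of DICOM keyword -> value; keyword sets below
# are illustrative subsets, not the full PS3.15 profile.

# Illustrative subset of PHI-bearing DICOM keywords.
PHI_KEYWORDS = {"PatientName", "PatientID", "PatientBirthDate", "InstitutionName"}

# Tags that must be present for an image to pass a basic QC gate.
REQUIRED_KEYWORDS = {"Modality", "StudyInstanceUID", "SeriesInstanceUID"}


def deidentify(tags: dict) -> dict:
    """Return a copy with PHI keywords replaced by a redaction marker."""
    return {k: ("REDACTED" if k in PHI_KEYWORDS else v) for k, v in tags.items()}


def qc_missing_tags(tags: dict) -> set:
    """Return the set of required keywords missing from the header."""
    return REQUIRED_KEYWORDS - tags.keys()


header = {
    "PatientName": "Doe^Jane",
    "PatientID": "12345",
    "Modality": "CT",
    "StudyInstanceUID": "1.2.840.113619.2.55.1",
}

clean = deidentify(header)
print(clean["PatientName"])             # REDACTED
print(sorted(qc_missing_tags(header)))  # ['SeriesInstanceUID']
```

At scale, the same two checks would run per series inside the pipeline, with QC failures routed back to the data provider rather than silently dropped.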
Data Analysis And Integration
Integrate ML- and AI-assisted tools into pipelines for inline image analysis, classification, and segmentation, extracting and enriching metadata for downstream analyses and optimizing performance.
Image Data Management
Build and maintain large-scale catalogs of curated imaging datasets adhering to FAIR principles (Findable, Accessible, Interoperable, Reusable), enabling easy discovery of and access to imaging data assets.
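A catalog like the one described above can be sketched as a small searchable index. All record fields, identifiers, and URIs here are hypothetical; a production catalog would live in a persistent metadata store (e.g. PostgreSQL) with stable identifiers, but the discovery pattern is the same.

```python
# Minimal sketch of a searchable catalog of curated imaging datasets.
# Records, IDs, and URIs are illustrative placeholders.
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    dataset_id: str        # persistent identifier (Findable)
    modality: str          # e.g. CT, MRI, OCT
    therapeutic_area: str  # e.g. Oncology, Neurology, Ophthalmology
    uri: str               # resolvable access location (Accessible)


CATALOG = [
    DatasetRecord("ds-001", "CT", "Oncology", "s3://bucket/ds-001"),
    DatasetRecord("ds-002", "OCT", "Ophthalmology", "s3://bucket/ds-002"),
    DatasetRecord("ds-003", "MRI", "Neurology", "s3://bucket/ds-003"),
]


def find_datasets(modality=None, therapeutic_area=None):
    """Filter the catalog on any combination of indexed fields."""
    return [
        r for r in CATALOG
        if (modality is None or r.modality == modality)
        and (therapeutic_area is None or r.therapeutic_area == therapeutic_area)
    ]


print([r.dataset_id for r in find_datasets(therapeutic_area="Oncology")])  # ['ds-001']
```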
Compliance And Controls
Ensure applicable compliance and privacy controls are followed as required by GxP and CSV validations.
Collaboration
Work closely with image scientists, data scientists, clinical operations (ClinOps), and biomarker research teams, supporting data needs for various primary and secondary endpoint analyses.
External Collaboration
Work with external partners, e.g., CROs, to ensure imaging data received conforms to established agreements, quality standards, and completeness.
Lead The Delivery Team
Ensure timely delivery of the product backlog and features.
Agile Participation
Participate in and lead various Agile ceremonies with the team throughout planning and execution.
Ideal Candidate Would Have (Multiple Competencies From the List Below)
- Worked with medical imaging data and platforms, PACS, VNAs, etc.
- Worked with radiology imaging data such as CT, PET, and MRI (including NIfTI-format volumes), and with ophthalmic imaging such as OCT, FA, and CFP.
- Good understanding of DICOM standards: structure, metadata parsing, tags, and multi-frame images.
- Worked with clinical information data standards like SDTM, ADaM.
- Data integration across diverse data sources, e.g., imaging data with tabular clinical data.
- De-identification methodologies, PHI/PII detection and privacy controls.
- Good understanding of GXP and CSV validation frameworks.
- Proficient in Python and libraries such as pandas, pydicom, SimpleITK, and dicom-numpy, plus tools such as dcm2niix.
- Hands-on experience with ETL/ELT involving large medical imaging datasets.
- Experience with Apache Airflow, Spark, Talend, or similar workflow orchestration tools.
- Proficiency with SQL and NoSQL databases and image metadata stores (PostgreSQL, MongoDB, etc.).
- Practical experience with AWS infrastructure and Data Services such as RDS, Athena, Glue, EC2, Lambda, S3.
- Familiar with EKS, Docker, and HPC.
- Experience in data analysis and report generation using TIBCO, Tableau, Amazon QuickSight, etc.
- Good knowledge of Git, GitLab, and DevOps tools like Jenkins, Terraform.
- Familiar with ML workflows for computer vision tasks such as segmentation, classification, etc.
- Nice to have: experience implementing NLP and GenAI solutions.
- Worked with cross-functional global teams in a dynamic Agile environment.
- Led and mentored Agile team members.
- Has 10+ years of experience with data platforms, analysis, and insights.
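As a small illustration of the DICOM tag structure named in the competencies above: every DICOM attribute is addressed by a (group, element) pair of 16-bit hex numbers, conventionally written like "(0010,0010)" (PatientName). A stdlib-only parser for that notation, as one might use when filtering headers by tag:

```python
# Illustrative sketch: parse DICOM tag notation "(gggg,eeee)" into a numeric
# (group, element) pair. PatientName is tag (0010,0010) per the DICOM standard.
import re

TAG_RE = re.compile(r"\(([0-9A-Fa-f]{4}),([0-9A-Fa-f]{4})\)")


def parse_tag(text: str) -> tuple:
    """Return (group, element) as ints for a tag string like '(0010,0010)'."""
    m = TAG_RE.fullmatch(text.strip())
    if not m:
        raise ValueError(f"not a DICOM tag: {text!r}")
    return int(m.group(1), 16), int(m.group(2), 16)


print(parse_tag("(0010,0010)"))  # (16, 16), i.e. (0x0010, 0x0010)
```

In practice a library such as pydicom handles this representation natively; the sketch only shows the structure the competency refers to.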
Engineering degree: BE/ME/BTech/MTech/BSc/MSc.
Technical certification in multiple technologies is desirable.