Director of Machine Learning - AI Powered Biological Design
Lead the development of AI-powered biological design capabilities, overseeing machine learning strategy and execution across multiple project teams. Collaborate with experimental biologists and computational scientists to advance de novo biological sequence-to-function design capabilities. Develop and implement machine learning models for regulatory element and other DNA, RNA, and protein design.
Job Description
Director of Machine Learning - AI Powered Biological Design
The mission of the Allen Institute is to unlock the complexities of bioscience and advance our knowledge to improve human health. Using an open science, multi-scale, team-oriented approach, the Allen Institute focuses on accelerating foundational research, developing standards and models, and cultivating new ideas to make a broad, transformational impact on science.
Join our artificial intelligence-powered lab, an initiative at the intersection of academic creativity and start-up style execution. Our mission is to apply machine learning to biological design. Join us as we build a series of interconnected design-test-loop "flywheels" that enable design of synthetic enhancers, protein binders, and more.
We are looking for a Head of Machine Learning to lead ML strategy and execution across three tightly integrated project teams building biological design-build-test-learn (DBTL) "flywheels" for AI model development. The goal of the overall program is to advance de novo biological sequence-to-function design capabilities. An initial flywheel is focused on designing regulatory elements (enhancers) that drive user-specified gene expression across all mammalian cell types. We envision at least three flywheels operating continuously, spanning DNA, RNA, and protein design. In addition to building dedicated models for specific tasks like enhancer design, we aim over time to integrate multiple specialized models into more generalized sequence-to-function design capabilities. Situated at the interface of experimental biology and computation, you will report into the Seattle Hub for Synthetic Biology (SeaHub) administrative unit. You will own the ML vision and execution for this program, supervising ML/data scientists and collaborating with project team leads as we learn how to harness a portfolio of flywheels to maximally accelerate biological design.
At the Allen Institute, we believe that science is for everyone, and should be open to everyone. We are dedicated to combating biases and reducing barriers to STEM careers more broadly.
We also believe that science is better when it includes different perspectives and voices. We strive to make the Allen Institute a place where everyone feels like they belong and are empowered to do their best work in a supportive environment.
We are an equal-opportunity employer and strongly encourage people from all backgrounds to apply for our open positions.
Essential Functions
- In partnership with the Executive Director and in collaboration with the Allen Technology Office, define and own the ML strategy for the enhancer flywheel and additional synthetic biology flywheels, including success metrics and roadmaps
- Build and manage a central ML team, plus ML/data scientists embedded in project teams
- Architect and implement sequence-to-function and generative models for regulatory element and other DNA, RNA, and protein design, leveraging state-of-the-art architectures (CNNs, transformers, diffusion, etc.)
- Design and optimize DBTL loops via collaboration with project teams, e.g., supporting assay design, active learning tactics, assay configuration, and benchmarking
- Supervise quantitative analysis and QC of high-throughput assays (e.g., MPRA, single-cell data), integrating external datasets such as scATAC-seq and RNA-seq for transfer learning
- Prioritize projects based on organizational goals, collaborating cross-functionally to ensure timely, high-quality delivery
- Establish ML best practices across projects (code quality, experiment tracking, model and data versioning, documentation, reproducibility)
- Partner with data/engineering teams in the Office of the CTO to define and maintain the computational infrastructure required for large-scale sequence modeling and genomics data integration
- Serve as the primary program ML representative, clearly communicating strategy, trade-offs, and results to project leads, leadership, and external collaborators, and contributing to publications and presentations
- Propose and develop ML partnerships across academia, biotech, non-profits, and industry in support of our mission
Required Education And Experience
- Ph.D. in Computer Science, Computational Biology, Statistics, Physics, or related field; or equivalent combination of degree and experience
- 5+ years of post-Ph.D. (or equivalent) experience building, training, and deploying ML models in a research or product environment
- Deep expertise in ML applied to biological sequences or structured biological data (e.g., regulatory genomics, transcriptional modeling, protein/DNA design)
- Strong proficiency in Python and at least one modern ML framework (e.g., PyTorch, JAX, or TensorFlow)
- Proven track record of technical leadership: mentoring scientists/engineers, setting standards, and delivering complex ML systems
- Excellent communication skills and ability to collaborate effectively with both computational and experimental scientists
- Demonstrated experience integrating diverse datasets (e.g., ATAC-seq, RNA-seq, single-cell data) into predictive or generative models
- Research experience in regulatory genomics, enhancers/promoters, transcription factor binding, or MPRA-based model training
- Experience with AI-driven protein design tools such as RFdiffusion, ProteinMPNN, or comparable workflows
- Hands-on work with DBTL loops in synthetic biology, including active learning, experiment selection, or closed-loop optimization
- Experience with generative models for biological sequences (e.g., autoregressive, VAE, diffusion, RL-based sequence design)
- Prior experience leading ML efforts in small, fast-moving, or start-up-style research environments
- Strong publication or open-source record in ML for biology, sequence modeling, or synthetic biology
- Fine motor movements in fingers/hands to operate computers and other office equipment
- This role is currently working onsite and is expected to work onsite four days/week. The primary work location for this role is 700 Dexter Ave N., with the flexibility to work remotely on a limited basis. We are a Washington State employer, and any remote work must be performed in Washington State.
- Attendance and participation in national and international conferences as appropriate
- **Please note, this opportunity offers relocation assistance**
- **Please note, this opportunity may offer work visa sponsorship**
- $224,200 - $294,250 *
- Final salary depends on the required education for the role, experience, level of skills relevant to the role, and work location, where applicable.
- Employees (and their families) are eligible to enroll in benefits per eligibility rules outlined in the Allen Institute's Benefits Guide. These benefits include medical, dental, vision, and basic life insurance. Employees are also eligible to enroll in the Allen Institute's 401k plan. Paid time off is also available as outlined in the Allen Institute's Benefits Guide. Details on the Allen Institute's benefits offering are located at the following link to the Benefits Guide: https://alleninstitute.org/careers/benefits .