ML Scientist, Biomolecular Foundation Models
The Story Behind the Role
What makes this role a unique opportunity at the intersection of AI and biology?
This isn't just another ML role. It's a chance to be at the absolute frontier of generative AI for drug discovery. You'll be building the foundational models that could design the next generation of life-saving therapies. It requires a deep passion for both machine learning and the biological sciences, and offers the chance to make a tangible impact on human health.
FL117, a venture-backed stealth AI x bio startup, is seeking a highly skilled ML Scientist. In this role, you will build the architectures and training pipelines for next-generation multimodal generative models for biomolecular design, and drive their overall development. We are looking for people who are excited to work on ambitious projects in a fast-paced environment and to push the frontier of AI-powered drug discovery.
What You'll Do
- Spearhead the design and development of cutting-edge deep learning models aimed at advancing generative biomolecular design.
- Design novel representations and tokenization schemes for biomolecules to enable more efficient transfer learning.
- Establish and own a robust research infrastructure.
What You'll Need
- MSc or PhD in Computer Science, AI/ML, Physics, Biophysics, or a related field.
- 2+ years of internship/full-time industry or postdoctoral ML research experience.
- 2+ years of experience developing production-grade machine learning solutions.
- Experience training large-scale generative (transformer/diffusion) models on HPC or multi-GPU, multi-node distributed systems.
- Proficiency in Python and relevant ML libraries (PyTorch or TensorFlow).
- Proven ability to lead technical initiatives, communicate progress, and drive innovation in generative biomolecular design.
- **(Preferred)** Contributions to open-source data projects in biology/chemistry.
- **(Preferred)** Strong presence in the ML community through publications and conference proceedings.
- **(Preferred)** Knowledge of biological data formats (FASTA, SDF, PDBx/mmCIF) and domain-specific ontologies/metadata standards.
- **(Preferred)** Experience with models like ESM, DiffDock, ProteinMPNN, OpenFold, etc.
- **(Preferred)** Familiarity with ML software development and cloud services (Docker, AWS Batch, Step Functions, EKS, AWS ParallelCluster, etc.).
- **(Preferred)** Experience with libraries such as RDKit, Rosetta, ProDy, Biopython, PyMOL, OpenMM, Dagster, Prefect, etc.
Application Insights
Applications are submitted through a Greenhouse job board. The company notes: 'If you have some of the experience listed but not all, please apply anyway.'