ML Research Engineer - Training at Achira — CareerPair

6 days ago

ML Research Engineer - Training

San Francisco, California, United States

$164,638-$259,000 / year

full-time, senior, Drug Discovery

Description

You will work at the intersection of cutting-edge machine learning and rigorous research workflows to design and scale intelligent training systems for atomistic simulation models. Your role involves building foundations for training these models at scale, diving into architecture, data, optimizers, and representation learning to unlock their full potential. You'll help invent playbooks for pretraining foundation simulation models and contribute to transforming drug discovery through pioneering models that simulate the physical world with unprecedented speed and fidelity.

Requirements

Experience working closely with ML researchers to turn scientific goals into engineering execution
Experience designing training workflows that enable fast scaling, result tracking, and failure troubleshooting
Knowledge of training fundamentals such as learning rates, batch normalization, weight initialization, and optimizer schedules
Fluency in PyTorch and comfort with distributed cloud setups such as multi-node, multi-GPU training
Pragmatic DevOps skills, including familiarity with k8s, SLURM, or similar infrastructure
Energized by uncharted problems and motivated to define new best practices in training world models
Sense of relentless urgency and natural collaboration with a focus on team success
Desire to work in a well-funded, bold, talent-dense organization on transformational impact

Responsibilities

Scale foundation simulation model (FSM) training by developing next-generation training pipelines for deep simulation models
Map strategy by defining and iterating on short-, medium-, and long-term training strategies
Engineer metrics by building robust training diagnostics and interpretability tools
Debug at depth by diagnosing training failures and designing resilient, reproducible workflows
Tune architectures by shaping and adapting them for improved training dynamics and performance
Explore representations by investigating representation learning in the molecular data domain
Automate workflows by using generative coding tools to speed up routine processes
