Member of Technical Staff – Model Training
Palo Alto, CA
$175,000-$350,000 / year
Full-time · Senior · Artificial Intelligence · Visa Sponsor
Description
You will design, build, and scale post-training pipelines for large language models, focusing on fine-tuning and preference-optimization techniques such as RLHF and DPO. Your work will directly improve model reliability, alignment, and cost efficiency, and you will collaborate with cross-functional teams to land those improvements in production systems.
Requirements
- Hands-on experience training and fine-tuning large transformer models on multi-GPU/multi-node clusters.
- Fluent in PyTorch and ecosystem tools (Torchtune, FSDP, DeepSpeed) with knowledge of distributed-training internals, mixed precision, and memory-efficiency tricks.
- Shipped or published work in RLHF, DPO, GRPO, or RLAIF, with a grasp of the practical trade-offs between these methods.
- Bachelor's degree or equivalent in a related field.
- Communicates crisply with both technical and non-technical teammates.
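For candidates unfamiliar with the preference-optimization methods named above, a minimal sketch of the DPO objective (Rafailov et al., 2023) gives a sense of the work: the loss compares policy and reference log-probabilities of a preferred vs. dispreferred response. The function name and arguments below are illustrative, not part of any codebase referenced in this posting.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Illustrative DPO loss for one preference pair.

    Each argument is a summed log-probability of a full response under the
    trainable policy (pi_*) or the frozen reference model (ref_*);
    beta scales the implicit KL penalty.
    """
    # Implicit rewards: beta-scaled log-ratios against the reference model.
    r_chosen = beta * (pi_chosen - ref_chosen)
    r_rejected = beta * (pi_rejected - ref_rejected)
    margin = r_chosen - r_rejected
    # -log sigmoid(margin), written with log1p for numerical stability.
    return math.log1p(math.exp(-margin))
```

In practice the log-probabilities come from batched forward passes of the policy and a frozen reference checkpoint, and the loss is averaged over a batch of preference pairs.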
Responsibilities
- Contribute to end-to-end post-training workflows including dataset curation, hyper-parameter search, evaluation, and rollout using PyTorch, Torchtune, FSDP/DeepSpeed.
- Prototype and compare alignment techniques (e.g., curriculum RL, multi-objective reward modeling, tool-use fine-tuning) and push best ideas into production.
- Automate training at scale by building robust pipeline components, tools, scripts, and dashboards for reproducible experiments.
- Define metrics, run A/B tests, and iterate quickly to meet aggressive quality targets.
- Collaborate with inference, safety, and product teams to land improvements in customer-facing systems.