2d ago

Researcher, Alignment Training

San Francisco

$250k-$445k / year

full-time · lead · ai-ml

💼 About This Role

You'll study how frontier models acquire durable behavioral tendencies across the training stack. You'll define target behaviors, design data and training interventions, and build evaluation loops to determine whether learned behaviors are broad and robust. This role is close to the core training loop for a leading AI research company.

🎯 What You'll Do

  • Develop synthetic data methods for training behavioral tendencies
  • Study how pre-training, mid-training, and post-training shape behavior
  • Build evaluation loops connecting behavior to training data
  • Design reusable data generation and filtering pipelines
  • Create experiments distinguishing durable behavior from artifacts

📋 Requirements

  • Record of technically excellent work in large-scale ML
  • Experience with pre-training, post-training, synthetic data, or model evaluation
  • Ability to design experiments with subtle or noisy signal
  • Strong judgment about research questions worth pursuing

✨ Nice to Have

  • Alignment research background
  • Experience with training infrastructure
  • Cross-functional collaboration skills

🎁 Benefits & Perks

  • 💰 Competitive salary and equity
  • 🏖️ Flexible PTO
  • 🏥 Health insurance
  • 📚 Learning and development budget

📨 Hiring Process

Estimated timeline: 3-5 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Interview · 60 min
  3. Research Presentation · 60 min
  4. Hiring Committee · 30 min