2d ago
Researcher, Alignment Training
San Francisco
$250k-$445k / year
full-time · lead · ai-ml
About This Role
You'll study how frontier models acquire durable behavioral tendencies across the training stack. You'll define target behaviors, design data and training interventions, and build evaluation loops to determine whether learned behaviors are broad and robust. This role is close to the core training loop for a leading AI research company.
What You'll Do
- Develop synthetic data methods for training behavioral tendencies
- Study how pre-training, mid-training, and post-training shape behavior
- Build evaluation loops connecting behavior to training data
- Design reusable data generation and filtering pipelines
- Create experiments distinguishing durable behavior from artifacts
Requirements
- Record of technically excellent work in large-scale ML
- Experience with pre-training, post-training, synthetic data, or model evaluation
- Ability to design experiments that detect subtle or noisy signals
- Strong judgment about research questions worth pursuing
Nice to Have
- Alignment research background
- Experience with training infrastructure
- Cross-functional collaboration skills
Benefits & Perks
- Competitive salary and equity
- Flexible PTO
- Health insurance
- Learning and development budget
Hiring Process
Estimated timeline: 3-5 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Interview · 60 min
- 3. Research Presentation · 60 min
- 4. Hiring Committee · 30 min