23h ago
Member of Technical Staff - Post-Training and RL
Palo Alto, CA
$180k-$600k / year
ai-ml
About This Role
You'll work on core post-training and reinforcement learning problems, including reward modeling and RLHF. Your main impact will be improving model reasoning, truthfulness, and real-world capabilities. You will learn which project you would start on before receiving an offer.
What You'll Do
- Work on reward modeling and preference optimization (RLHF/DPO)
- Implement RL for improving reasoning and truthfulness
- Enhance real-world capabilities through post-training techniques
Requirements
- Truth-seeking AI is your most important priority
- Obsessed with building useful models via post-training and RL techniques
- Power user of AI models eager to push boundaries with RL and alignment
Nice to Have
- Previous experience in post-training or RLHF, or experience training models used by millions
Benefits & Perks
- Base salary $180,000 - $600,000
- Equity included in total rewards
- Comprehensive medical, vision, and dental coverage
- 401(k) retirement plan
- Short- and long-term disability insurance, plus life insurance
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. Onsite Interview · 4 hours