23h ago

Member of Technical Staff - Post-Training and RL

Palo Alto, CA

$180k-$600k / year

ai-ml

💼 About This Role

You'll work on critical post-training and reinforcement learning challenges, including reward modeling and RLHF. Your core impact will be improving model reasoning, truthfulness, and real-world capabilities. You will know what your first project is before an offer is made.

🎯 What You'll Do

  • Work on reward modeling and preference optimization (RLHF/DPO)
  • Implement RL for improving reasoning and truthfulness
  • Enhance real-world capabilities through post-training techniques

📋 Requirements

  • Truth-seeking AI is your most important priority
  • Obsessed with building useful models via post-training and RL techniques
  • Power user of AI models eager to push boundaries with RL and alignment

✨ Nice to Have

  • Previous experience in post-training or RLHF, or in training models used by millions

🎁 Benefits & Perks

  • 🏖️ Base salary $180,000 - $600,000
  • 🎯 Equity included in total rewards
  • 🩺 Comprehensive medical, vision, and dental coverage
  • 🏦 401(k) retirement plan
  • 🛡️ Short- and long-term disability insurance, plus life insurance

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. Onsite Interview · 4 hours