Research, Post-Training

San Francisco Bay Area

$200k-$350k / year (est.)

full-time · senior · ai-ml

💼 About This Role

You'll iterate on post-training recipes and evaluation design to shape how AI agents learn and behave. Your work directly determines what Devin and future systems can do in real-world tasks. This role blends deep research and hands-on engineering in a small, talent-dense team.

🎯 What You'll Do

  • Develop post-training recipes and iterate on datasets and hyperparameters.
  • Design and build evaluations that capture real-world performance.
  • Investigate and understand unexpected training results.
  • Apply alignment techniques like RLHF to shape agent behavior.

📋 Requirements

  • Track record in post-training or alignment methods like RLHF.
  • Strong fundamentals in probability, statistics, and ML theory.
  • Evidence of original contributions: publications, open-source, or industry results.
  • Experience with large-scale distributed training and debugging.

✨ Nice to Have

  • Systems-level thinking about training pipelines and evaluation.
  • Comfort with fast-moving research environments.

🎁 Benefits & Perks

  • 🚀 Large GPU allocations from day one, drawn from a pool of thousands of GPUs.
  • 💡 Fast deployment of prototypes to real products.
  • 👥 Small, highly selective team with minimal process.