Member of Technical Staff - Mid-training

Palo Alto, CA

$180k-$440k / year

Full-time, Artificial Intelligence

💼 About This Role

You'll join a small team focused on scaling synthetic coding data to trillions of tokens and optimizing mid-training data mixtures for flagship AI models. Your work directly impacts the ceiling of reinforcement learning and long-context capabilities. The team has a flat structure in which leadership is earned through initiative.

🎯 What You'll Do

  • Scale synthetic coding data to trillions of tokens with Docker verification
  • Distill flagship model intelligence into flash models via synthetic data
  • Optimize mid-training data mixtures to boost RL ceiling
  • Engineer long-context data recipes and robust evaluations

📋 Requirements

  • Expertise in ML and large model scaling with knowledge of scaling laws
  • Strong ability to design ML experiments
  • Familiarity with state-of-the-art techniques for curating AI training data across modalities
  • Strong engineering abilities in Spark, Ray, and similar frameworks

🎁 Benefits & Perks

  • 💵 Equity in company
  • 🏥 Comprehensive medical, vision, and dental coverage
  • 📈 401(k) retirement plan
  • 🛡️ Short & long-term disability insurance
  • 🧑‍⚕️ Life insurance