15h ago

Member of Technical Staff, Training Infra Engineer

Paris

$200k-$300k / year est.

full-time senior Remote ai-ml

💼 About This Role

You'll design and write high-performance software for training large-scale AI models and improve the training infrastructure for frontier models. You'll bridge research and production by implementing tools that speed up training cycles. You'll work with one of the highest compute-to-engineer ratios in the industry.

🎯 What You'll Do

  • Design and write high-performance, scalable software for training.
  • Improve the training setup, from infrastructure to codebase performance.
  • Craft tools to speed up training cycles and improve infrastructure efficacy.
  • Research and experiment with ideas on supercompute and data infrastructure.

📋 Requirements

  • Extremely strong software engineering skills.
  • Proficiency in Python and in ML frameworks and compilers such as JAX, PyTorch, and XLA/MLIR.
  • Experience with distributed training infrastructure (Kubernetes, Slurm, Ray).
  • Hands-on experience training large models at scale and contributing to tooling.

✨ Nice to Have

  • Papers published at top-tier venues (NeurIPS, ICML, ICLR, etc.).

🎁 Benefits & Perks

  • 🤝 Open and inclusive culture and work environment
  • 🧑‍💻 Work with a cutting-edge AI research team
  • 🍽 Weekly lunch stipend and in-office meals
  • 🦷 Full health and dental benefits with mental health budget
  • ✈️ 6 weeks vacation (30 working days)