17h ago

Research Engineer, ML Systems

Redwood City, CA

$225k-$400k / year

full-time · ai-ml

💼 About This Role

You'll join the ML Systems team to optimize GPU clusters and develop systems for training and serving consumer AI models. Your work will reach millions of users daily by improving latency, efficiency, and model performance. Projects include writing efficient Triton kernels and building scalable RLHF stacks.

🎯 What You'll Do

  • Write efficient Triton kernels tuned for specific models and hardware
  • Develop prefix-aware routing algorithms to improve serving cache hit rate
  • Train and distill LLMs to improve latency while preserving accuracy
  • Build an efficient, scalable distributed RLHF stack

📋 Requirements

  • PhD or equivalent research experience
  • Strong understanding of modern ML techniques (reinforcement learning, transformers)
  • Track record of exceptional research or creative ML systems projects
  • Comfortable writing model development code in PyTorch

✨ Nice to Have

  • Experience training large models in a distributed setting with PyTorch distributed, DeepSpeed, or Megatron
  • Experience with GPUs and collectives, writing kernels in Triton, CUDA, or CUTLASS
  • Familiarity with LLM inference tooling such as vLLM and FlashAttention

🎁 Benefits & Perks

  • 💰 Competitive compensation with equity
  • 🏖️ Generous PTO
  • 🩺 Health, dental, and vision insurance
  • 🚀 Work on cutting-edge AI