
Software Engineer, Model Performance Tooling

San Francisco

$160k-$200k / year

full-time, junior, ai-ml

💼 About This Role

You'll build automated performance and diagnostic tools for next-generation AI infrastructure at a leading AI inference platform. You'll measure GPU FLOPS, stress-test clusters, and define benchmarks that ensure production readiness. This role offers deep hardware exposure and high ownership.

🎯 What You'll Do

  • Run and automate LLM benchmark suites like GSM8K and MMLU.
  • Create automated acceptance tests for new GPU clusters.
  • Develop internal GPU-enabled development environments.
  • Build tools for automated model evaluation and optimization.

📋 Requirements

  • Python familiarity
  • Interest in GPU memory subsystems and networking
  • Automation mindset: a habit of scripting repetitive tasks
  • Desire to understand Transformer math and FLOPs

✨ Nice to Have

  • C++ familiarity
  • Experience with NVIDIA Nsight Systems or PyTorch Profiler
  • Knowledge of quantization or speculative decoding

🎁 Benefits & Perks

  • 💰 Competitive compensation with meaningful equity
  • 🏥 100% medical, dental, vision for employee and dependents
  • 🏖️ Flexible PTO including Winter Break
  • 👶 Paid parental leave
  • 🏦 Company-facilitated 401(k)