Software Engineer, Model Performance Tooling
San Francisco
$160k-$200k / year
full-time / junior / ai-ml
💼 About This Role
You'll build automated performance and diagnostic tools for next-generation AI infrastructure at a leading AI inference platform. You'll measure GPU FLOPS, stress-test clusters, and define benchmarks that ensure production readiness. This role offers deep hardware exposure and high ownership.
🎯 What You'll Do
- Run and automate LLM benchmark suites like GSM8K and MMLU.
- Create automated acceptance tests for new GPU clusters.
- Develop internal GPU-enabled development environments.
- Build tools for automated model evaluation and optimization.
📋 Requirements
- Python familiarity
- Interest in GPU memory subsystems and networking
- Automation mindset: a habit of scripting repetitive tasks
- Desire to understand Transformer math and FLOPs
✨ Nice to Have
- C++ familiarity
- Experience with NVIDIA Nsight Systems or PyTorch Profiler
- Knowledge of quantization or speculative decoding
🎁 Benefits & Perks
- 💰 Competitive compensation with meaningful equity
- 🏥 100% medical, dental, vision for employee and dependents
- 🏖️ Flexible PTO including Winter Break
- 👶 Paid parental leave
- 🏦 Company-facilitated 401(k)