Software Engineer, Model Performance Tooling at Baseten

San Francisco

$160k-$200k / year

full-time, junior, ai-ml

💼 About This Role

You'll build automated performance and diagnostic tools for next-generation AI infrastructure at Baseten, an AI inference platform. You'll measure GPU FLOPS, stress-test clusters, and define the benchmarks that determine whether new hardware is ready for production. The role offers deep hardware exposure and high ownership.

🎯 What You'll Do

Run and automate LLM benchmark suites like GSM8K and MMLU.
Create automated acceptance tests for new GPU clusters.
Develop internal GPU-enabled development environments.
Build tools for automated model evaluation and optimization.

📋 Requirements

Python familiarity
Interest in GPU memory subsystems and networking
Automation mindset; scripting repetitive tasks
Desire to understand Transformer math and FLOPs

✨ Nice to Have

C++ familiarity
Experience with NVIDIA Nsight Systems or PyTorch Profiler
Knowledge of quantization or speculative decoding

🎁 Benefits & Perks

💰 Competitive compensation with meaningful equity
🏥 100% medical, dental, and vision coverage for employees and dependents
🏖️ Flexible PTO including Winter Break
👶 Paid parental leave
🏦 Company-facilitated 401(k)
