Senior Infrastructure Engineer, Lab
San Francisco, CA
$172k-$209k / year
full-time, senior, ai-ml
💼 About This Role
You'll manage a lab environment built on the latest GPU systems, diagnosing and troubleshooting hardware faults. Your work directly supports new product integration and research experimentation for cutting-edge AI infrastructure.
🎯 What You'll Do
- Manage a lab environment for the latest GPU systems.
- Diagnose and troubleshoot hardware faults in GPU racks.
- Perform component-level and field-replaceable unit (FRU) repairs.
- Conduct post-repair validation and burn-in testing.
📋 Requirements
- Experience diagnosing and repairing high-density rack-mounted compute hardware in production environments.
- Deep understanding of, and hands-on experience with, GPU architectures.
- Experience supporting NVIDIA A100, H200, GB200, and B200 platforms and AMD MI350X/MI355X platforms.
- Strong Linux command-line experience for diagnostics and testing.
✨ Nice to Have
- Technical certification or degree in Electrical Engineering, Computer Science, or related field.
- Experience working directly with hardware vendors and managing escalations.
- Background in large-scale GPU fleet operations or hyperscale data centers.
🎁 Benefits & Perks
- 💰 Competitive compensation with equity and Restricted Stock Units.
- 🏖️ Paid time off, holidays, and parental leave.
- 🏥 Comprehensive health, dental & vision insurance with HSA contributions.
- 📚 Professional development and tuition reimbursement.
- 🍽️ Daily meal allowance and cell phone stipend.