11h ago
Senior GPU Infrastructure Engineer
San Francisco, CA
✨ $160k-$220k / year (est.)
full-time · senior · ai-ml
💼 About This Role
You'll build and scale Hyperbolic's GPU Cloud Marketplace, transforming raw GPUs from global suppliers into a programmable, orchestrated pool for AI developers. Your core impact will be crafting the orchestration layer that delivers up to 75% cost savings over traditional cloud providers. You'll work at the cutting edge of cloud infrastructure and multi-tenancy provisioning.
🎯 What You'll Do
- Design and implement bare-metal provisioning workflows including IPMI/Redfish and PXE boot.
- Build GPU scheduling and orchestration with placement strategies for multi-GPU jobs.
- Develop CI/CD pipelines and an observability stack for infrastructure automation.
- Deploy and manage storage infrastructure for AI/ML workloads (object, block, distributed file systems).
📋 Requirements
- 5+ years of production experience in cloud infrastructure or distributed systems.
- Deep expertise in bare-metal provisioning (IPMI, Redfish, PXE, BMC management).
- Proficient in GPU scheduling and orchestration with memory and topology management.
- Strong skills in Terraform or Pulumi and infrastructure CI/CD.
✨ Nice to Have
- Experience with high-performance networking (InfiniBand, RoCE).
- Familiarity with distributed storage systems like Ceph, Weka, or VAST Data.
🎁 Benefits & Perks
- 🚀 Cutting-edge AI infrastructure work with a Series A startup.
- 🏢 Collaborative team led by PhD co-founders in AI and CS.
- 🌍 Remote-friendly with San Francisco HQ.
- 📈 Equity compensation.
- 🏥 Health insurance.