11h ago

Senior GPU Infrastructure Engineer

San Francisco, CA

$160k-$220k / year (est.)

full-time · senior · ai-ml

💼 About This Role

You'll build and scale Hyperbolic's GPU Cloud Marketplace, transforming raw GPUs from global suppliers into a programmable, orchestrated pool for AI developers. Your core impact will be crafting the orchestration layer that delivers up to 75% cost savings over traditional cloud providers. You'll work at the cutting edge of cloud infrastructure and multi-tenancy provisioning.

🎯 What You'll Do

  • Design and implement bare-metal provisioning workflows including IPMI/Redfish and PXE boot.
  • Build GPU scheduling and orchestration with placement strategies for multi-GPU jobs.
  • Develop CI/CD pipelines and an observability stack for infrastructure automation.
  • Deploy and manage storage infrastructure for AI/ML workloads (object, block, distributed file systems).

📋 Requirements

  • 5+ years of experience running cloud infrastructure or distributed systems in production.
  • Deep expertise in bare-metal provisioning (IPMI, Redfish, PXE, BMC management).
  • Proficient in GPU scheduling and orchestration with memory and topology management.
  • Strong skills in Terraform or Pulumi and infrastructure CI/CD.

✨ Nice to Have

  • Experience with high-performance networking (InfiniBand, RoCE).
  • Familiarity with distributed storage systems like Ceph, Weka, or VAST Data.

🎁 Benefits & Perks

  • 🚀 Cutting-edge AI infrastructure work at a Series A startup.
  • 🏢 Collaborative team led by PhD co-founders in AI and CS.
  • 🌍 Remote-friendly with San Francisco HQ.
  • 📈 Equity compensation.
  • 🏥 Health insurance.