Staff Infrastructure Engineer, Cluster Infrastructure
San Francisco, CA | New York City, NY | Seattle, WA
$320k-$405k / year
Full-time | Lead | Hybrid | AI/ML | Visa Sponsor
About This Role
You'll own the technical strategy for agent-driven cluster lifecycle management, provisioning high-bandwidth, secure-by-default compute clusters across cloud providers and datacenters. Your work directly enables scaling Claude to millions of users and accelerating AI safety research at a company growing faster than nearly any other.
What You'll Do
- Own technical strategy for agent-driven cluster lifecycle management
- Partner across teams to ingest new compute capacity on time
- Collaborate on physical build-out and high-bandwidth inter-cluster connectivity
- Drive strategy on cluster scalability, homogeneity, and fault tolerance
Requirements
- Deep expertise in distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP/Azure)
- Strong proficiency in at least one systems language (Rust, Go, or Python) and IaC with Terraform
- Track record of leading complex, multi-quarter technical initiatives spanning multiple teams or systems
- Ability to build alignment across senior stakeholders and communicate effectively at all levels
Nice to Have
- 8+ years of software engineering experience, including a technical lead role
- Experience operating large-scale compute infrastructure at hyperscale (100+ clusters, 10K+ nodes)
- Depth in Kubernetes internals, cluster provisioning, or orchestration systems
Benefits & Perks
- Unlimited PTO
- Comprehensive health insurance
- Annual salary $320k-$405k
- Equity packages
- Visa sponsorship available
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter screen · 30 min
- 2. Technical phone interview · 60 min
- 3. Onsite interviews (3-4 rounds) · 4 hours