1d ago
Staff Software Engineer, Node Infra
London, UK
$325k-$485k / year
full-timeleadai-ml Visa Sponsor
๐ Tech Stack
๐ผ About This Role
You'll own the technical strategy and roadmap for node lifecycle management at Anthropic, ensuring reliable and scalable infrastructure for frontier AI research. Your work will directly impact how quickly we train new models and serve Claude to millions of users. This role offers the chance to shape the future of AI infrastructure across multiple clouds and accelerator families.
๐ฏ What You'll Do
- Define and execute the technical strategy for node lifecycle management
- Drive cross-team initiatives to build and scale AI clusters across multiple clouds
- Design and operate automated systems for hardware health detection and remediation
- Establish operational excellence practices including incident response and on-call
๐ Requirements
- Distributed systems expertise with cloud platforms like Kubernetes and Terraform
- Proficiency in Rust, Go, or Python
- Hands-on experience with ML accelerators (GPUs, TPUs, or Trainium)
- Track record of leading multi-quarter technical initiatives across teams
โจ Nice to Have
- 8+ years of software engineering experience as a technical lead
- Experience managing hyperscale compute infrastructure (10K+ nodes)
- Depth in Kubernetes internals or cluster orchestration systems
๐ Benefits & Perks
- ๐ฐ Competitive salary
- ๐ Hybrid work policy (minimum 25% in office)
- ๐ Visa sponsorship available
๐จ Hiring Process
Estimated timeline: 2-4 weeks
- 1Recruiter Screenยท 30 min
- 2Technical Interviewยท 60 min
- 3System Design Interviewยท 60 min
- 4Hiring Manager Interviewยท 45 min
0 0 0