Staff + Sr. Software Engineer, Cloud Inference Launch Engineering
San Francisco, CA | Seattle, WA
$320k-$485k / year
Full-time · Senior · AI/ML · Visa Sponsor
About This Role
You'll join Anthropic's Cloud Inference team to own the validation pipeline for the inference server and load balancer across cloud platforms. Your work directly determines how quickly frontier models and features ship to every cloud provider, reclaiming capacity when compute is scarcest. You'll close the gaps between first-party and CSP inference behavior, designing CI/CD infrastructure for model launches.
What You'll Do
- Bring up inference for new model architectures and ship to cloud platforms
- Integrate new inference features like structured sampling and prompt caching
- Identify and fix cross-platform inference gaps at the source
- Design and build CI/CD infrastructure for the inference server and load balancer
- Drive down merge-to-production cycle time with faster validation
Requirements
- Strong background in distributed systems serving millions of users
- Track record of building automation or test infrastructure improving release velocity
- Experience with at least one major cloud platform (AWS, GCP, or Azure)
- Exposure to Kubernetes, Infrastructure as Code, or container orchestration
Nice to Have
- LLM inference optimization, batching, and caching strategies
- Proficiency in Python or Rust
- Experience with multi-region deployments and global traffic management
Benefits & Perks
- Unlimited PTO
- Health insurance
- Competitive salary
- Equity
Hiring Process
Estimated timeline: 3-5 weeks (AI estimate)
1. Phone Screen · 30 min
2. Technical Interview · 60 min
3. On-site Interviews · 4 hours