8h ago

Staff + Sr. Software Engineer, Cloud Inference Launch Engineering

San Francisco, CA | Seattle, WA

$320k-$485k / year

full-time | senior | ai-ml | Visa Sponsor

💼 About This Role

You'll join Anthropic's Cloud Inference team to own the validation pipeline for the inference server and load balancer across cloud platforms. Your work directly determines how fast frontier models and features ship to every cloud provider, and it reclaims capacity when compute is scarcest. You'll bridge gaps between first-party and cloud service provider (CSP) inference behavior and design CI/CD infrastructure for model launches.

🎯 What You'll Do

  • Bring up inference for new model architectures and ship them to cloud platforms
  • Integrate new inference features like structured sampling and prompt caching
  • Identify and fix cross-platform inference gaps at the source
  • Design and build CI/CD infrastructure for the inference server and load balancer
  • Drive down merge-to-production cycle time with faster validation

📋 Requirements

  • Strong background in distributed systems serving millions of users
  • Track record of building automation or test infrastructure that improves release velocity
  • Experience with at least one major cloud platform (AWS, GCP, or Azure)
  • Exposure to Kubernetes, Infrastructure as Code, or container orchestration

✨ Nice to Have

  • LLM inference optimization, batching, and caching strategies
  • Proficiency in Python or Rust
  • Experience with multi-region deployments and global traffic management

🎁 Benefits & Perks

  • 🏖️ Unlimited PTO
  • 🏥 Health insurance
  • 💵 Competitive salary
  • 📈 Equity

📨 Hiring Process

Estimated timeline: 3-5 weeks · AI estimate

  1. Phone Screen · 30 min
  2. Technical Interview · 60 min
  3. On-site Interviews · 4 hours