3h ago

Staff Software Engineer, Inference

London, UK

$325,000-$390,000 / year

Full-time · Senior · Artificial Intelligence

Description

You will build and maintain the critical systems that serve Claude to millions of users worldwide, working across the entire inference stack from intelligent request routing to fleet-wide orchestration across diverse AI accelerators.

Requirements

  • Significant software engineering experience with distributed systems
  • Experience with performance optimization and large-scale service orchestration
  • Familiarity with LLM inference optimization, batching strategies, and multi-accelerator deployments
  • Experience with Kubernetes and cloud infrastructure (AWS, GCP)
  • Proficiency in Python or Rust

Responsibilities

  • Designing intelligent routing algorithms for request distribution across thousands of accelerators
  • Autoscaling the compute fleet to match supply with demand across production and research workloads
  • Building production-grade deployment pipelines for releasing new models
  • Integrating new AI accelerator platforms
  • Contributing to inference features like structured sampling and prompt caching