
Member of Technical Staff - Model Serving / API Backend Engineer

Freiburg, Germany | San Francisco, CA

$180k-$300k / year

Full-time · Senior · Hybrid · AI/ML

💼 About This Role

You'll own the bridge between research breakthroughs and production systems: the path from a trained checkpoint to a fast, reliable inference service running on GPU infrastructure. This role removes the bottleneck between frontier research and production reality.

🎯 What You'll Do

  • Turn research checkpoints into production-ready inference services
  • Design and maintain high-performance APIs serving millions of requests
  • Optimize inference latency and throughput across GPU infrastructure
  • Build scalable serving architectures that handle unpredictable traffic

📋 Requirements

  • Experience building and operating ML inference services in production
  • Proficiency in Python, FastAPI, and async systems
  • Experience scaling APIs or ML systems under load
  • Comfort working in fast-moving, research-adjacent environments
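To give a flavor of the async serving work involved: a common throughput technique is micro-batching, where concurrent requests are grouped into one batched model call. The sketch below is illustrative only (the `MicroBatcher` name and the stand-in `model_fn` are our own, not part of this role's codebase), using just the Python standard library:

```python
import asyncio

class MicroBatcher:
    """Collects concurrent requests into small batches before invoking the
    model, trading a few milliseconds of latency for higher GPU throughput.
    `model_fn` is a stand-in for a real batched inference call."""

    def __init__(self, model_fn, max_batch=8, max_wait=0.01):
        self.model_fn = model_fn
        self.max_batch = max_batch
        self.max_wait = max_wait
        self.queue = asyncio.Queue()

    async def infer(self, x):
        # Each caller gets a future resolved when its batch completes.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((x, fut))
        return await fut

    async def run(self):
        while True:
            # Block for the first request, then gather more until the
            # batch is full or the wait deadline passes.
            item = await self.queue.get()
            batch = [item]
            loop = asyncio.get_running_loop()
            deadline = loop.time() + self.max_wait
            while len(batch) < self.max_batch:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            inputs = [x for x, _ in batch]
            outputs = self.model_fn(inputs)  # one batched call
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)

async def main():
    # Toy "model": doubles each input in a single batched call.
    batcher = MicroBatcher(lambda xs: [x * 2 for x in xs])
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.infer(i) for i in range(5)))
    worker.cancel()
    return results

print(asyncio.run(main()))  # [0, 2, 4, 6, 8]
```

In production the batched call would dispatch to a GPU model, and the batch size and wait window would be tuned against latency SLOs.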

✨ Nice to Have

  • Real-time or low-latency inference systems
  • TensorRT, reduced precision, or model compilation
  • Frontend demo tooling (Streamlit, Gradio, React)

๐ŸŽ Benefits & Perks

  • ๐Ÿ–๏ธ Flexible Hybrid Work
  • ๐ŸŒ International Team with offices in Freiburg and SF
  • ๐Ÿš€ Cutting-Edge AI Research environment
  • ๐Ÿ’ป Open Science Culture

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Interview · 60 min
  3. System Design Interview · 60 min