Member of Technical Staff - Model Serving / API Backend Engineer
Freiburg, Germany | San Francisco, CA
$180k-$300k / year
Full-time · Senior · Hybrid · AI/ML
About This Role
You'll own the bridge between research breakthroughs and production systems: turning research checkpoints into reliable inference services, designing APIs that serve millions of requests, and driving down latency across GPU infrastructure. This role removes the bottleneck between frontier research and production reality.
What You'll Do
- Turn research checkpoints into production-ready inference services
- Design and maintain high-performance APIs serving millions of requests
- Optimize inference latency and throughput across GPU infrastructure
- Build scalable serving architectures that handle unpredictable traffic
Requirements
- Experience building and operating ML inference services in production
- Proficiency in Python, FastAPI, and async systems
- Experience scaling APIs or ML systems under load
- Comfort working in fast-moving, research-adjacent environments
Nice to Have
- Experience with real-time or low-latency inference systems
- Familiarity with TensorRT, reduced-precision inference, or model compilation
- Frontend demo tooling (Streamlit, Gradio, React)
Benefits & Perks
- Flexible hybrid work
- International team with offices in Freiburg and San Francisco
- Cutting-edge AI research environment
- Open science culture
Hiring Process
Estimated timeline: 2-4 weeks (AI estimate)
1. Recruiter Call · 30 min
2. Technical Interview · 60 min
3. System Design Interview · 60 min