9h ago
Senior / Lead Machine Learning Engineer
Germany
✨ $160k-$200k / year (est.)
full-time, lead, ai-ml, Visa Sponsor
💼 About This Role
You'll join Inworld's research lab to optimize real-time multimodal inference at thousands of queries per second, owning the full serving pipeline from model to production. The role offers a flat structure and fast iteration alongside top AI researchers.
🎯 What You'll Do
- Optimize inference serving frameworks like vLLM or TRT-LLM
- Implement quantization, distillation, and caching strategies
- Profile and optimize GPU performance with CUDA and C++
- Scale multi-GPU/multi-node inference with Kubernetes and Ray
📋 Requirements
- 3+ years experience in ML serving or inference optimization
- Proficiency in C++, CUDA, or Rust
- Hands-on experience with vLLM or TRT-LLM
- Experience with Kubernetes and Ray for distributed systems
✨ Nice to Have
- PhD in CS, Physics, or Math
- Open-source contributions to inference engines
- Full-cycle ownership from research to production
🎁 Benefits & Perks
- 🚀 Impact-driven culture with minimal process
- 🌍 Relocation support to the San Francisco Bay Area possible
- 📚 Open-source contributions encouraged