MLE/MLOps
Paris
$100k-$150k / year
Full-time · Mid-level · Hybrid · AI/ML
Tech Stack
About This Role
You'll own our inference infrastructure end-to-end, optimizing latency, throughput, and cost across our model fleet. You'll build and scale model serving with TensorZero, vLLM, SGlang, TensorRT, and Kubernetes, and turn research into production-ready products. This role bridges Research and Product to ship models that are fast, cheap, and reliable in production.
What You'll Do
- Own inference infrastructure end-to-end: optimize latency, throughput, and cost.
- Build and scale model serving with TensorZero, vLLM, SGlang, TRT, and Kubernetes.
- Design and maintain vector search pipelines backed by vector stores.
- Turn research into product: format, sample, and deploy experimental models.
Requirements
- 3+ years shipping high-performance ML systems in production.
- Deep hands-on experience with inference optimization.
- Comfortable across the stack: from CUDA kernels to Kubernetes manifests to Grafana dashboards.
Nice to Have
- Rust experience
- Custom Triton kernels
- Experience benchmarking inference systems
Benefits & Perks
- Competitive salary with equity
- 20 days of paid vacation
- Hybrid work in Paris + relocation package
- Best medical insurance in France
- All hardware, tools, and AI subscriptions covered
Hiring Process
Estimated timeline: 1-2 weeks
- 1. Intro call with a colleague · 30 min
- 2. Take-home exercise · 2-3 hours
- 3. Technical interview · 1 hour
- 4. Final call with CEO and CTO · 45 min
Heads Up
- The department is listed as "Research team," but the role is MLE/MLOps, mixing research and production work.
- The description includes "Familiarity with support metrics," which is not typical for an MLE role.