
MLE/MLOps

Paris

$100k-$150k / year

full-time · mid · Hybrid · ai-ml

🛠 Tech Stack

💼 About This Role

You'll own our inference infrastructure end-to-end, optimizing latency, throughput, and cost across our model fleet. You'll build and scale model serving with TensorZero, vLLM, SGLang, TRT, and Kubernetes, and turn research into production-ready products. This role bridges Research and Product to ship models that are fast, cheap, and reliable.

🎯 What You'll Do

  • Own inference infrastructure end-to-end: optimize latency, throughput, and cost.
  • Build and scale model serving with TensorZero, vLLM, SGLang, TRT, and Kubernetes.
  • Design and maintain vector search pipelines backed by vector stores.
  • Turn research into product: format, sample, and deploy experimental models.
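The vector-search responsibility above can be illustrated with a minimal in-memory pipeline. This is a hedged sketch, not part of the posting: the embedding dimension, corpus size, and top-k are made-up illustrative values, and a production pipeline would use a real vector store rather than a NumPy array.

```python
import numpy as np

def top_k_cosine(query: np.ndarray, index: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k index rows most similar to query (cosine)."""
    # Normalize rows so dot products equal cosine similarities.
    q = query / np.linalg.norm(query)
    rows = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = rows @ q
    # argsort ascending; take the last k and reverse for descending order.
    return np.argsort(sims)[-k:][::-1]

rng = np.random.default_rng(0)
index = rng.normal(size=(100, 64))              # 100 vectors, dim 64 (illustrative)
query = index[42] + 0.01 * rng.normal(size=64)  # near-duplicate of row 42
print(top_k_cosine(query, index, k=3)[0])       # prints 42
```

The same top-k-by-similarity shape carries over to any vector store; only the index structure (e.g. HNSW instead of brute force) changes.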

📋 Requirements

  • 3+ years shipping high-performance ML systems in production.
  • Deep hands-on experience with inference optimization.
  • Comfortable across the stack: from CUDA kernels to Kubernetes manifests to Grafana dashboards.
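The latency/throughput focus in this role usually starts from percentile measurements, the kind exported to the Grafana dashboards mentioned above. A minimal standard-library sketch with made-up sample timings (the function name and values are illustrative, not from the posting):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute the tail percentiles usually tracked for inference SLOs."""
    # quantiles(n=100) returns the 99 cut points p1..p99 (exclusive method).
    qs = statistics.quantiles(samples_ms, n=100)
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}

# Illustrative request latencies in milliseconds (not real data).
samples = [12.0 + 0.5 * i for i in range(200)]  # 12.0 .. 111.5 ms
stats = latency_percentiles(samples)
print(stats["p50"])  # prints 61.75
```

Optimization work is then judged by how the p95/p99 tail moves, not the mean, since a few slow requests dominate user-perceived latency.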

✨ Nice to Have

  • Rust experience
  • Custom Triton kernels
  • Benchmarking experience

🎁 Benefits & Perks

  • 💰 Competitive salary with equity
  • 🏖️ 20 days of paid vacation
  • 🏢 Hybrid work in Paris + relocation package
  • 🏥 Best medical insurance in France
  • 🖥️ All hardware, tools, and AI subscriptions covered

📨 Hiring Process

Estimated timeline: 1-2 weeks

  1. Intro call with a colleague · 30 min
  2. Take-home exercise · 2-3 hours
  3. Technical interview · 1 hour
  4. Final call with CEO and CTO · 45 min

🚩 Heads Up

  • The department is listed as 'Research team', but the role is MLE/MLOps, mixing research and production work.
  • The description mentions 'Familiarity with support metrics', which is not typical for an MLE role.