Staff / Principal Machine Learning Engineer, Serving at Inworld AI

Mountain View, California, USA

$270k-$500k / year

Full-time, Lead, Hybrid, AI/ML

💼 About This Role

You'll lead inference optimization and model serving for a top AI research lab, building real-time multimodal systems. Your work directly powers products used by leading companies, and you'll own full-cycle delivery from research to production.

🎯 What You'll Do

Optimize inference serving frameworks like vLLM or TRT-LLM
Profile and accelerate model performance on NVIDIA GPUs
Design distributed systems for multi-GPU/multi-node inference
Containerize and deploy models to production reliably

📋 Requirements

Deep understanding of serving frameworks (vLLM, TRT-LLM)
Hands-on experience with quantization, distillation, caching
Proficiency in C++, CUDA, Rust, or optimized Python
Experience with Kubernetes, Ray, and distributed scaling

✨ Nice to Have

Non-trivial open-source contributions to inference engines
PhD in CS, Physics, Math or equivalent practical experience
Public technical write-ups or deep-dive systems projects

🎁 Benefits & Perks

💰 Competitive base salary of $270k-$500k, plus bonus and equity
🏢 Relocation assistance to Mountain View office
📈 Equity in a top AI startup backed by major VCs
🏥 Benefits package (details not specified in the posting)
