Member of Technical Staff (Inference)
Paris
✨ $150k–$250k / year (est.)
Full-time · Hybrid · AI/ML
💼 About This Role
You'll develop and optimize inference pipelines for H's agentic AI models, focusing on high throughput and low latency. You'll collaborate with research teams to improve model efficiency using techniques such as quantization and distributed computing.
🎯 What You'll Do
- Develop scalable, low-latency inference pipelines
- Optimize model performance using quantization and caching
- Write specialized GPU kernels for attention and matmul
- Implement state-of-the-art inference techniques
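To give a flavor of the quantization work listed above, here is a minimal sketch of symmetric per-tensor int8 weight quantization in Python. This is purely illustrative (function names and approach are assumptions, not H's actual stack); production serving systems typically use per-channel or block-wise schemes instead:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from int8 weights."""
    return q.astype(np.float32) * scale

# Rounding error per element is bounded by half the quantization step.
w = np.random.default_rng(0).standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
```

The payoff in serving is memory: int8 weights take a quarter of the space of float32, which raises the batch size (and therefore throughput) a single GPU can sustain.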
📋 Requirements
- MS or PhD in Computer Science or related field
- Proficiency in Python, Rust, or C/C++
- Experience with GPU programming (CUDA, Triton, Metal)
- Experience with model compression and quantization
✨ Nice to Have
- Experience with LLM serving frameworks (vLLM, TensorRT-LLM)
- Experience with CUDA kernel programming and NCCL
- Experience with deep learning inference frameworks (PyTorch, ONNX Runtime)
🎁 Benefits & Perks
- 🚀 Join a top AI startup in early days
- 🌍 Collaborate with world-class AI talent
- 📈 Professional growth opportunities