1d ago
ML Infrastructure Engineer
Amsterdam, Netherlands
$150k-$250k / year (est.)
full-time · senior · Remote · software
About This Role
You'll lead GPU benchmarking for machine learning workloads at Nebius, a full-stack AI cloud platform. Your work will directly influence platform optimisation and next-gen hardware development. You'll collaborate with experts across hardware and software teams.
What You'll Do
- Profile and analyze GPU performance at system and kernel level
- Compare GPU performance across platforms and software stacks
- Debug and optimize ML workloads for GPU efficiency
- Perform acceptance testing for new GPU clusters
Requirements
- Deep understanding of theoretical ML foundations
- Experience with deep learning frameworks (PyTorch, JAX, Megatron-LM)
- Knowledge of GPU stack (CUDA, NCCL, drivers)
- Familiarity with containerized environments (Docker, Kubernetes)
Nice to Have
- Familiarity with LLM inference frameworks (vLLM, SGLang)
- Proficiency in Python and performance profiling tools
- Contributions to open-source ML benchmarking
Benefits & Perks
- Competitive compensation
- Career growth and learning
- Flexibility and work-life balance
- Collaborative culture
- International environment
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter screen · 30 min
- 2. Technical interview · 60 min
- 3. System design · 60 min
- 4. Offer · 30 min