1d ago

ML Infrastructure Engineer

Amsterdam, Netherlands

✨ $150k-$250k / year (est.)

Full-time · Senior · Remote · Software

💼 About This Role

You'll lead GPU benchmarking for machine learning workloads at Nebius, a full-stack AI cloud platform. Your work will directly influence platform optimisation and next-gen hardware development. You'll collaborate with experts across hardware and software teams.

🎯 What You'll Do

  • Profile and analyze GPU performance at system and kernel level
  • Compare GPU performance across platforms and software stacks
  • Debug and optimize ML workloads for GPU efficiency
  • Perform acceptance testing for new GPU clusters

📋 Requirements

  • Deep understanding of theoretical ML foundations
  • Experience with deep learning frameworks (PyTorch, JAX, Megatron-LM)
  • Knowledge of GPU stack (CUDA, NCCL, drivers)
  • Familiarity with containerized environments (Docker, Kubernetes)

✨ Nice to Have

  • Familiarity with LLM inference frameworks (vLLM, SGLang)
  • Proficiency in Python and performance profiling tools
  • Contributions to open-source ML benchmarking

🎁 Benefits & Perks

  • 💰 Competitive compensation
  • 📈 Career growth and learning
  • 🏖️ Flexibility and work-life balance
  • 🤝 Collaborative culture
  • 🌍 International environment

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter screen · 30 min
  2. Technical interview · 60 min
  3. System design · 60 min
  4. Offer · 30 min