12h ago

Staff ML Performance Engineer

London, UK

✨ $200k-$250k / year (est.)

Full-time · Lead · Hybrid

💼 About This Role

You'll optimise ML inference for edge accelerators and GPUs, driving the team's focus on running large transformer models efficiently on low-cost, low-power devices. Your work directly enables Wayve's first driving product by turning models into reliable production systems on in-vehicle compute. This is a hands-on role contributing to high-impact, early-stage projects.

🎯 What You'll Do

  • Profile and pinpoint bottlenecks across the full inference stack.
  • Implement optimisations in compilers, runtimes, and kernels.
  • Build robust benchmarking and regression testing for performance.
  • Optimise for multiple targets (e.g. NVIDIA Orin/Thor, Qualcomm).

📋 Requirements

  • Proven experience improving performance in production systems with tight constraints.
  • Strong proficiency with at least one relevant stack/toolchain (e.g. TensorRT, CUDA, QNN, Triton, OpenCL).
  • Comfort operating at multiple levels of abstraction from high-level model behaviour to low-level execution.
  • Strong software engineering fundamentals (debugging, profiling, testing, maintainable code).

✨ Nice to Have

  • Exposure to embedded or edge deployment of ML models.
  • Experience with NVIDIA and/or Qualcomm SoCs and performance tooling.
  • Python and C++ proficiency.

🎁 Benefits & Perks

  • ๐Ÿ–๏ธ Hybrid working policy combining office and home time.
  • ๐Ÿ“ˆ High-impact projects in autonomous driving.
  • ๐ŸŒ Diverse and inclusive culture.

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Phone Screen · 60 min
  3. Onsite Interview (3-4 rounds) · 4 hours