Staff ML Performance Engineer
London, UK
$200k-$250k / year (est.)
Full-time · Lead · Hybrid
About This Role
You'll optimise ML inference for edge accelerators and GPUs, driving the team's focus on running large transformer models efficiently on low-cost, low-power devices. Your work directly enables Wayve's first driving product by turning models into reliable production systems on in-vehicle compute. This is a hands-on role contributing to high-impact, early-stage projects.
What You'll Do
- Profile and pinpoint bottlenecks across the full inference stack.
- Implement optimisations in compilers, runtimes, and kernels.
- Build robust benchmarking and regression testing for performance.
- Optimise for multiple targets (e.g. NVIDIA Orin/Thor, Qualcomm).
Requirements
- Proven experience improving performance in production systems with tight constraints.
- Strong proficiency with at least one relevant stack/toolchain (e.g. TensorRT, CUDA, QNN, Triton, OpenCL).
- Comfort operating at multiple levels of abstraction, from high-level model behaviour to low-level execution.
- Strong software engineering fundamentals (debugging, profiling, testing, maintainable code).
Nice to Have
- Exposure to embedded or edge deployment of ML models.
- Experience with NVIDIA and/or Qualcomm SoCs and performance tooling.
- Python and C++ proficiency.
Benefits & Perks
- Hybrid working policy combining office and home time.
- High-impact projects in autonomous driving.
- Diverse and inclusive culture.
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Phone Screen · 60 min
- 3. Onsite Interview (3-4 rounds) · 4 hours