21h ago
Staff Computer Vision Researcher (LLM)
London
$210k-$290k / year (est.)
full-time · lead · Hybrid · ai-ml
About This Role
You'll bridge 3D computer vision and large language models to create a unified framework for spatial reasoning. Your core impact will be enabling systems to perform context-aware navigation and answer complex questions about the physical world. This role stands out for building the geospatial AI foundation at a company with a proprietary database of over 30 billion posed images.
What You'll Do
- Lead cross-modal grounding between 3D features and language embeddings.
- Develop algorithms for continuous semantics in 3D maps.
- Build spatial reasoning systems for Embodied AI.
- Define benchmarks for measuring spatial common sense in LLMs.
Requirements
- PhD in Computer Vision, Machine Learning, or Robotics with focus on multimodal understanding.
- 4+ years of ML research with models bridging 3D Vision and Language.
- Expert knowledge of 3D Geometry (SfM, SLAM, VPS) and Transformer-based architectures.
- Multiple first-author publications at CVPR, NeurIPS, or ICLR.
Nice to Have
- Experience with Gaussian Splatting or NeRFs.
- Background in robotics (ROS) or agentic systems.
- Experience with open-set recognition or zero-shot learning.
Benefits & Perks
- Unlimited PTO
- Comprehensive health insurance
- Equity packages
- Daily catered lunch
- Learning & development budget
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. On-site Interview · 4 hours