21h ago

Staff Computer Vision Researcher (LLM)

London

โœจ $210k-$290k / yearest.

full-timelead Hybridai-ml

๐Ÿ›  Tech Stack

๐Ÿ’ผ About This Role

You'll bridge 3D computer vision and large language models to create a unified framework for spatial reasoning. Your core impact will be enabling systems to perform context-aware navigation and answer complex questions about the physical world. This role stands out for building the geospatial AI foundation at a company with a proprietary database of over 30 billion posed images.

๐ŸŽฏ What You'll Do

  • Lead cross-modal grounding between 3D features and language embeddings.
  • Develop algorithms for continuous semantics in 3D maps.
  • Build spatial reasoning systems for Embodied AI.
  • Define benchmarks for measuring spatial common sense in LLMs.

๐Ÿ“‹ Requirements

  • PhD in Computer Vision, Machine Learning, or Robotics with focus on multimodal understanding.
  • 4+ years of ML research with models bridging 3D Vision and Language.
  • Expert knowledge of 3D Geometry (SfM, SLAM, VPS) and Transformer-based architectures.
  • Multiple first-author publications at CVPR, NeurIPS, or ICLR.

โœจ Nice to Have

  • Experience with Gaussian Splatting or NeRFs.
  • Background in robotics (ROS) or agentic systems.
  • Experience with open-set recognition or Zero-Shot learning.

๐ŸŽ Benefits & Perks

  • ๐Ÿ–๏ธ Unlimited PTO
  • ๐Ÿฅ Comprehensive health insurance
  • ๐Ÿ’ฐ Equity packages
  • ๐Ÿฑ Daily catered lunch
  • ๐ŸŽ“ Learning & development budget

๐Ÿ“จ Hiring Process

Estimated timeline: 2-4 weeks ยท AI estimate

  1. 1Recruiter Screenยท 30 min
  2. 2Technical Interviewยท 60 min
  3. 3On-site Interviewยท 4 hours
0 0 0