23h ago
VLM Research Engineer
Berlin
✨ $150k-$200k / year (est.)
full-time · senior · ai-ml
💼 About This Role
You'll push the limits of vision-language models for real-world video understanding at an AI-first startup. You'll design and adapt multimodal models and turn them into production pipelines used by customers. This role combines cutting-edge research with applied engineering in a fast-moving team.
🎯 What You'll Do
- Design and adapt vision-language models for video understanding
- Build and maintain large-scale training pipelines on GPU clusters
- Curate and augment video-text and action datasets
- Develop robust benchmarks for video QA and temporal understanding
- Deliver production-ready inference pipelines to product teams
📋 Requirements
- PhD in computer vision, machine learning, or a related field
- Strong background in video-centric deep learning
- Experience training large vision or vision-language models (e.g., InternVL)
- Proven work with multi-GPU training (PyTorch, distributed)
- Solid engineering habits: clean Python, reproducible experiments
✨ Nice to Have
- Publications at top-tier venues (CVPR, ICCV, NeurIPS)
- Experience with 3D/4D scene representations or action generation
- Inference optimization: quantization, TensorRT, model distillation
🎁 Benefits & Perks
- 💰 Competitive salary & stock options
- 🌍 Collaborative, diverse team with flat hierarchy
- ⏰ Flexible working hours
- 🎯 Real-world impact in manufacturing AI
- 🤝 Supportive culture promoting underrepresented groups