AI Researcher (Multimodal Audio/Video Generation) at Tavus

San Francisco, CA | London, UK

✨ $160k-$240k / year (est.)

full-time · senior · Remote · ai-ml

💼 About This Role

You'll lead research on audio-visual avatar generation for conversational AI humans. Your mission is to push generative models in diffusion and multimodal modeling to new frontiers. You'll translate cutting-edge research into production and publish at top venues.

🎯 What You'll Do

Lead research on audio-visual generation for avatars (Neural Avatars, Talking-Heads).
Design models capturing verbal and non-verbal signals in conversation flow.
Drive innovation in diffusion models, long-video generation, and audio-visual modeling.
Translate research into production with Applied ML and engineering teams.

📋 Requirements

PhD or equivalent research experience.
2-3+ years of hands-on experience with generative models at scale.
Expertise in diffusion models and efficiency techniques.
Experience in multimodal generation (video, audio, language).

✨ Nice to Have

Skills in 3D graphics, Gaussian splatting, or large-scale training.
Exposure to generative AI models beyond your primary specialty.
Familiarity with software development best practices.

🎁 Benefits & Perks

🚀 Series A backed by Sequoia, Y Combinator, Scale Venture Partners.
🌐 Remote within US or Europe considered.
🏢 Office in San Francisco (hybrid) or London.

📨 Hiring Process

Estimated timeline: 3-5 weeks · AI estimate

1. Recruiter Call · 30 min
2. Technical Interview · 60 min
3. Research Presentation · 45 min
4. Team Interview · 45 min
5. Offer · 15 min
