Research Engineer, Voice
Palo Alto, California
$225k-$325k / year
full-time · mid · ai-ml · Visa Sponsor
About This Role
You'll advance the spoken intelligence behind Pi by developing and shipping neural models for speech synthesis, recognition, and real-time dialogue. You'll bridge research and production, turning cutting-edge audio ideas into natural voice experiences for millions of users.
What You'll Do
- Research and optimize neural models for voice and audio.
- Build production-grade training and inference pipelines.
- Run end-to-end experiments from data curation to evaluation.
- Collaborate with ML and product teams to integrate voice models.
Requirements
- 2-5 years of experience in audio or speech ML.
- Strong proficiency in PyTorch and large-scale model training.
- Solid understanding of audio/speech fundamentals including spectrograms and vocoders.
- Ability to take a research idea from prototype to production.
Nice to Have
- Experience with diffusion-based TTS or neural audio codecs.
- MS or PhD in a related field.
Benefits & Perks
- Unlimited PTO
- Parental leave for all parents and caregivers
- Diverse medical, dental, and vision options
- 401k matching program
- Support for visa needs for international employees
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter screen · 30 min
- 2. Technical interview · 60 min
- 3. Onsite interviews · 4 hours