17h ago
Research Engineer - Audio & Speech Models
San Francisco, California
$150k-$250k / year (est.)
full-time · mid · ai-ml · Visa Sponsor
About This Role
You'll contribute to Zyphra's Audio Team, building open-source audio models such as autoencoders and speech-to-speech systems. You'll work on large-scale training runs and architecture improvements. This role offers a chance to publish research at a fast-paced AI company.
What You'll Do
- Design and train novel audio autoencoder architectures
- Optimize performance of large-scale training pipelines
- Collect and process audio datasets for model training
- Run ablations to improve training methodologies
Requirements
- Strong research taste and ability to execute projects from conception to write-up
- Strong implementation ability in PyTorch and Python
- Expertise in audio models such as TTS, ASR, or speech-to-speech
- Experience with large-scale GPU cluster training
Nice to Have
- Experience with diffusion models or GANs
- Published research in machine learning venues
- Postgraduate degree in a scientific subject
Benefits & Perks
- Comprehensive medical, dental, vision, FSA
- Competitive compensation and 401(k) plan
- Relocation and immigration support
- In-office snacks and meals
- Unlimited PTO and company holidays
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
1. Recruiter Call · 30 min
2. Technical Interview · 60 min
3. Onsite Interview · 4 hours
Heads Up
- Requirements list is lengthy with many preferred qualifications included
- Job posting mentions 'unlimited PTO' without specifics
- Role may require deep expertise across multiple areas (audio, ML, engineering)