17h ago

Research Engineer - Audio & Speech Models

San Francisco, California

✨ $150k-$250k / year (est.)

full-time · mid · ai-ml · Visa Sponsor

💼 About This Role

You'll contribute to Zyphra's Audio Team, building open-source audio models such as autoencoders and speech-to-speech systems. You'll work on large-scale training runs and architecture improvements. The role also offers the chance to publish research at a fast-paced AI company.

🎯 What You'll Do

  • Design and train novel audio autoencoder architectures
  • Optimize performance of large-scale training pipelines
  • Collect and process audio datasets for model training
  • Run ablations to improve training methodologies

📋 Requirements

  • Strong research taste and ability to execute projects from conception to write-up
  • Strong implementation ability in PyTorch and Python
  • Expertise in audio models such as TTS, ASR, or speech-to-speech
  • Experience with large-scale GPU cluster training

✨ Nice to Have

  • Experience with diffusion models or GANs
  • Published research in machine learning venues
  • Postgraduate degree in a scientific subject

🎁 Benefits & Perks

  • 🏥 Comprehensive medical, dental, vision, FSA
  • 💰 Competitive compensation and 401(k) plan
  • ✈️ Relocation and immigration support
  • 🍕 In-office snacks and meals
  • 🏖️ Unlimited PTO and company holidays

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Interview · 60 min
  3. Onsite Interview · 4 hours

🚩 Heads Up

  • Requirements list is lengthy with many preferred qualifications included
  • Job posting mentions 'unlimited PTO' without specifics
  • Role may require deep expertise across multiple areas (audio, ML, engineering)