9h ago

Research Engineer, Voice

Palo Alto, California

$225k-$325k / year

full-time · mid · ai-ml · Visa Sponsor

🛠 Tech Stack

💼 About This Role

You'll advance the spoken intelligence behind Pi by developing and shipping neural models for speech synthesis, recognition, and real-time dialogue. You'll bridge research and production, turning cutting-edge audio ideas into natural voice experiences for millions of users.

🎯 What You'll Do

  • Research and optimize neural models for voice and audio.
  • Build production-grade training and inference pipelines.
  • Run end-to-end experiments from data curation to evaluation.
  • Collaborate with ML and product teams to integrate voice models.

📋 Requirements

  • 2-5 years of experience in audio or speech ML.
  • Strong proficiency in PyTorch and large-scale model training.
  • Solid understanding of audio/speech fundamentals including spectrograms and vocoders.
  • Ability to take a research idea from prototype to production.

✨ Nice to Have

  • Experience with diffusion-based TTS or neural audio codecs.
  • MS or PhD in a related field.

🎁 Benefits & Perks

  • 🏖️ Unlimited PTO
  • 👶 Parental leave for all parents and caregivers
  • 🏥 Diverse medical, dental, and vision options
  • 💰 401k matching program
  • 🌍 Support for visa needs for international employees

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter screen · 30 min
  2. Technical interview · 60 min
  3. Onsite interviews · 4 hours