11h ago
Software Engineer - Voice AI (Inference Runtime)
San Francisco
$165k-$330k / year
full-time · senior · Hybrid · ai-ml
About This Role
You'll own Baseten's Voice AI inference stack, bringing state-of-the-art open-source models into production for Voice AI customers. You'll drive real-time, large-scale model serving for speech-to-text (STT), text-to-speech (TTS), and voice agents, with impact across productivity, customer service, and healthcare. The role offers high ownership and cross-team collaboration.
What You'll Do
- Own and lead Voice AI product areas end-to-end.
- Design, build, and operate real-time model serving systems.
- Drive cross-team collaboration on full-stack technical problems.
- Mentor teammates through code reviews and design docs.
Requirements
- Bachelor's degree or higher in Computer Science or related field.
- Proven track record of owning production-grade, real-time, large-scale systems where tail latency matters.
- Proficiency in one or more programming or scripting languages (Python preferred).
- Comfortable using AI coding assistants (e.g., Claude Code, Cursor) daily.
Nice to Have
- Experience implementing pipeline-level model runtime optimizations (dynamic batching, async scheduling).
- Experience building developer platforms (SDKs, CLIs, APIs) for ML or infrastructure.
- Familiarity with speech/audio ML models (STT, TTS) and model-serving runtimes (vLLM, TensorRT).
Benefits & Perks
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO, including a company-wide Winter Break.
- Paid parental leave and a fertility stipend through Carrot.
- Company-facilitated 401(k).
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Phone Interview · 60 min
- 3. Onsite (System Design + Behavioral) · 4 hours