23h ago
Machine Learning Engineer, Model Evaluations (Speech LLM)
San Francisco, CA
$180k-$270k / year
Full-time · Hybrid · ai-ml
About This Role
You'll shape evaluation metrics and automated quality scoring for speech LLMs at a fast-growing AI startup.
What You'll Do
- Design and build evaluation harnesses for speech LLM checkpoints
- Define measurable benchmarks for speech model capabilities
- Own dashboards tracking model health during training
- Debug performance regressions across model and infrastructure
Requirements
- Python proficiency and experience with distributed systems
- Ability to translate ambiguous concepts into automated metrics
- Experience with data pipelines or evaluation harnesses at scale
- Ability to communicate statistical results clearly to stakeholders
Nice to Have
- Experience with speech metrics such as WER, CER, and PESQ
- LLM-as-a-Judge evaluation experience
- Human evaluation and crowdsourcing management
Benefits & Perks
- Competitive Compensation: $180K - $270K base + bonus + equity
- Unlimited PTO plus 13 paid holidays
- Top-tier healthcare including dental and vision
- 12 weeks paid parental leave
- Choice of top laptops/workstations
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. Onsite Interview · 120 min