23h ago

Machine Learning Engineer, Model Evaluations (Speech LLM)

San Francisco, CA

$180k-$270k / year

full-time · Hybrid · ai-ml

🛠 Tech Stack

💼 About This Role

You'll shape evaluation metrics and automated quality scoring for speech LLMs at a fast-growing AI startup.

🎯 What You'll Do

  • Design and build evaluation harnesses for speech LLM checkpoints
  • Define measurable benchmarks for speech model capabilities
  • Own dashboards tracking model health during training
  • Debug performance regressions across model and infrastructure

📋 Requirements

  • Python proficiency and experience with distributed systems
  • Ability to translate ambiguous concepts into automated metrics
  • Experience with data pipelines or evaluation harnesses at scale
  • Ability to communicate statistical results clearly to stakeholders

✨ Nice to Have

  • Experience with speech metrics such as WER, CER, and PESQ
  • LLM-as-a-Judge evaluation experience
  • Human evaluation and crowdsourcing management

🎁 Benefits & Perks

  • 💰 Competitive Compensation: $180K - $270K base + bonus + equity
  • 🏖️ Unlimited PTO plus 13 paid holidays
  • 🏥 Top-tier healthcare including dental and vision
  • 👶 12 weeks paid parental leave
  • 💻 Choice of top laptops/workstations

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. Onsite Interview · 120 min