Research Engineer, Evaluations
Remote - New York
Full-time · Senior · Remote · Voice AI
Description
You will own the evaluation infrastructure for streaming speech-to-text models, ensuring we measure the right things and benchmark against competitors. You'll translate customer feedback into quantifiable metrics, manage datasets, and maintain evaluation pipelines to accelerate research.
Requirements
- Machine Learning / Research Engineering background
- Experience with evaluation benchmarking and metrics development
- Ability to communicate with customer-facing teams and researchers
- Familiarity with voice agent ecosystems (e.g., LiveKit, Pipecat, Vapi)
- Strong analytical skills to convert vague feedback into concrete metrics
Responsibilities
- Own end-to-end and integration-level model evaluation for accuracy, latency, and feature-specific metrics
- Build and maintain competitive benchmarking pipelines against other providers
- Design and run systematic experiments to measure the impact of model changes
- Onboard, curate, and maintain evaluation datasets including public benchmarks and internal test sets
- Define evaluation metrics capturing real-world performance and translate qualitative customer feedback into quantifiable criteria
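To make the accuracy side of these responsibilities concrete: speech-to-text evaluation typically starts from word error rate (WER), the edit distance between a reference transcript and the model's hypothesis, normalized by reference length. Below is a minimal, self-contained sketch; it is illustrative only and not tied to any particular evaluation pipeline.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Word-level Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion over 6 words
```

In production pipelines this is usually preceded by text normalization (casing, punctuation, number formatting), since normalization choices can shift WER more than the model change under test.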