Audio Inference Engineer, Model Efficiency at Cohere

17h ago

New York

✨ $150k-$230k / year (est.)

full-time senior Remote ai-ml

💼 About This Role

You'll join a team optimizing audio inference serving efficiency using innovative techniques. You'll advance core metrics like latency, throughput, and quality for real-time audio processing. You'll collaborate with training and serving infrastructure teams for seamless deployment.

🎯 What You'll Do

Optimize audio inference serving systems for latency and throughput
Identify bottlenecks in audio processing and streaming workloads
Develop creative solutions for real-time audio inference
Collaborate with training and serving infrastructure teams

📋 Requirements

Significant experience developing high-performance audio or ML inference systems
Proficiency in C++ and Python
Hands-on experience with deep learning models for audio, speech, or language

✨ Nice to Have

GPU programming and low-level system optimization
Experience with duplex real-time streaming architectures
Internals of ML frameworks for audio (PyTorch, TensorFlow)

🎁 Benefits & Perks

🤝 Open and inclusive culture
🧑‍💻 Work on cutting-edge AI research
🍽 Weekly lunch stipend and in-office lunches
🦷 Full health and dental benefits with mental health budget
✈️ 6 weeks of vacation
