
Lead Member of Technical Staff, Inference Infrastructure

San Francisco

✨ $250k-$350k / year (est.)

full-time · lead · Remote · ai-ml

💼 About This Role

You'll lead the architecture and strategy for deploying optimized NLP models in low-latency, high-throughput, high-availability environments. You will provide technical leadership across multiple teams and serve as a key point of contact for customized customer deployments. This role offers the chance to shape next-generation AI platforms at a leading frontier model company.

🎯 What You'll Do

  • Lead architecture and strategy for deploying optimized NLP models to production.
  • Design customized deployments for customers with specific infrastructure needs.
  • Mentor engineers and raise the technical bar across the team.
  • Oversee resource and cost management for compute, storage, and network.

📋 Requirements

  • 8+ years of engineering experience running production infrastructure at large scale.
  • Demonstrated experience leading architecture of highly available distributed systems with Kubernetes and GPU workloads.
  • Deep expertise in Kubernetes development and production coding, including setting team standards.
  • Extensive experience across GCP, Azure, AWS, OCI, and multi-cloud hybrid environments.

✨ Nice to Have

  • Experience with accelerators (GPUs, TPUs) for latency and throughput improvements.
  • Proficiency in Golang or C++ for high-performance servers.
  • Experience in cross-functional leadership and mentoring.

๐ŸŽ Benefits & Perks

  • ๐Ÿค Open and inclusive culture
  • ๐Ÿง‘โ€๐Ÿ’ป Work with cutting-edge AI research team
  • ๐Ÿฝ Weekly lunch stipend and in-office snacks
  • ๐Ÿฆท Full health and dental benefits including mental health budget
  • โœˆ๏ธ 6 weeks vacation (30 working days)

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. System Design Interview · 60 min
  4. Hiring Manager · 45 min