Lead Member of Technical Staff, Inference Infrastructure
San Francisco
$250k-$350k / year (est.)
full-time · lead · remote · ai-ml
About This Role
You'll lead the architecture and strategy for deploying optimized NLP models in low-latency, high-throughput, high-availability environments. You will provide technical leadership across multiple teams and serve as a key point of contact for customized customer deployments. This role offers the chance to shape next-generation AI platforms at a leading frontier model company.
What You'll Do
- Lead architecture and strategy for deploying optimized NLP models to production.
- Design customized deployments for customers with specific infrastructure needs.
- Mentor engineers and raise the technical bar across the team.
- Oversee resource and cost management for compute, storage, and network.
Requirements
- 8+ years of engineering experience running production infrastructure at large scale.
- Demonstrated experience leading architecture of highly available distributed systems with Kubernetes and GPU workloads.
- Deep expertise in Kubernetes development and production coding, including setting team standards.
- Extensive experience across GCP, Azure, AWS, OCI, and multi-cloud hybrid environments.
Nice to Have
- Experience with accelerators (GPUs, TPUs) for latency and throughput improvements.
- Proficiency in Golang or C++ for high-performance servers.
- Experience in cross-functional leadership and mentoring.
Benefits & Perks
- Open and inclusive culture
- Work with a cutting-edge AI research team
- Weekly lunch stipend and in-office snacks
- Full health and dental benefits, including a mental health budget
- 6 weeks of vacation (30 working days)
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
1. Recruiter Screen · 30 min
2. Technical Interview · 60 min
3. System Design Interview · 60 min
4. Hiring Manager · 45 min