
Lead Member of Technical Staff, Inference Infrastructure

San Francisco

✨ $250k-$350k / year (est.)

full-time · lead · Remote · ai-ml

💼 About This Role

You'll lead the architecture and strategy for deploying optimized NLP models in low-latency, high-throughput, high-availability environments. You will provide technical leadership across multiple teams and serve as a key point of contact for customized customer deployments. This role offers the chance to shape next-generation AI platforms at a leading frontier model company.

🎯 What You'll Do

  • Lead architecture and strategy for deploying optimized NLP models to production.
  • Design customized deployments for customers with specific infrastructure needs.
  • Mentor engineers and raise the technical bar across the team.
  • Oversee resource and cost management for compute, storage, and network.

📋 Requirements

  • 8+ years of engineering experience running production infrastructure at large scale.
  • Demonstrated experience leading architecture of highly available distributed systems with Kubernetes and GPU workloads.
  • Deep expertise in Kubernetes development and production coding, including setting team standards.
  • Extensive experience across GCP, Azure, AWS, OCI, and multi-cloud hybrid environments.

✨ Nice to Have

  • Experience with accelerators (GPUs, TPUs) for latency and throughput improvements.
  • Proficiency in Golang or C++ for high-performance servers.
  • Experience in cross-functional leadership and mentoring.

๐ŸŽ Benefits & Perks

  • ๐Ÿค Open and inclusive culture
  • ๐Ÿง‘โ€๐Ÿ’ป Work with cutting-edge AI research team
  • ๐Ÿฝ Weekly lunch stipend and in-office snacks
  • ๐Ÿฆท Full health and dental benefits including mental health budget
  • โœˆ๏ธ 6 weeks vacation (30 working days)

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. System Design Interview · 60 min
  4. Hiring Manager · 45 min