2d ago

Customer Reliability Manager

San Francisco

✨ $180k-$250k / year (est.)

full-time · senior · ai-ml

🛠 Tech Stack

💼 About This Role

You'll lead a team of Customer Reliability Engineers providing high-touch support for Braintrust's AI observability platform, focusing on hybrid, BYOC, and SaaS deployments. Your core impact is reducing friction for developers building LLM-powered applications. You'll also own incident response and mentor senior engineers.

🎯 What You'll Do

  • Lead and grow a team of Customer Reliability Engineers.
  • Own the primary after-hours on-call rotation for customer-reported SEV1s.
  • Run incident response and escalation, including hands-on work on high-severity issues.
  • Lead new BYOC deployments and upgrades.

📋 Requirements

  • 5–10+ years of experience leading support for developer-facing products.
  • Deep familiarity with deploying Terraform, Helm, and Kubernetes infrastructure.
  • Comfortable reviewing, debugging, and reasoning about backend services and infrastructure.
  • Strong ownership of customer-impacting issues end-to-end.

✨ Nice to Have

  • Familiarity with OpenAI, Anthropic, or similar LLM providers at a systems level.
  • Experience guiding teams working with datasets, evaluation metrics, or prompt engineering.
  • Experience supporting self-hosted offerings (e.g., Terraform, Kubernetes).

๐ŸŽ Benefits & Perks

  • ๐Ÿฅ Medical, dental, and vision insurance
  • ๐Ÿฝ๏ธ Daily lunch, snacks, and beverages
  • ๐ŸŒด Flexible time off
  • ๐Ÿ’ฐ Competitive salary and equity
  • ๐Ÿค– AI Stipend

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Hiring Manager Interview · 45 min
  3. Technical Interview · 60 min

🚩 Heads Up

  • On-call rotation for SEV1s may lead to burnout
  • Requires 5-10+ years of leadership but also hands-on infrastructure work