Lead Operations Engineer
Las Vegas, Nevada
✨ $150k-$180k / year (est.)
full-time · mid · ai-ml
💼 About This Role
You'll be the technical backbone of the Global Operations Center, bridging frontline engineers and platform teams. You'll drive operational maturity by developing runbooks, leading post-incident reviews, and turning reactive firefighting into proactive operations. This role offers the chance to shape the reliability of a cutting-edge GPU cloud platform.
🎯 What You'll Do
- Establish and enforce runbook quality standards with escalation criteria
- Lead post-incident reviews and drive systemic improvements
- Develop runbooks for shift teams to execute independently
- Serve as technical liaison between GOC and engineering
📋 Requirements
- 3+ years of infrastructure operations or SRE experience in data center or cloud environments
- Strong understanding of GPU compute infrastructure and Linux
- Experience building runbooks and incident management frameworks
- Familiarity with monitoring tools like Prometheus or Grafana
✨ Nice to Have
- Experience in GPU cloud or AI/ML infrastructure operations
- Background in NOC and SOC functions
- Scripting skills in Python or Bash
🎁 Benefits & Perks
- 📈 Stock Options
- 🏥 100% paid Medical, Dental, and Vision insurance
- 💼 401(k)
- 🏖️ Flexible PTO
- 👶 Parental Leave