5h ago

Lead Operations Engineer

Las Vegas, Nevada

$150k-$180k / year (est.)

full-time · mid · ai-ml

💼 About This Role

You'll be the technical backbone of the Global Operations Center (GOC), bridging frontline engineers and platform teams. You'll drive operational maturity by developing runbooks, leading post-incident reviews, and turning reactive firefighting into proactive operations. This role offers the chance to shape the reliability of a cutting-edge GPU cloud platform.

🎯 What You'll Do

  • Establish and enforce runbook quality standards with escalation criteria
  • Lead post-incident reviews and drive systemic improvements
  • Develop runbooks for shift teams to execute independently
  • Serve as technical liaison between GOC and engineering

📋 Requirements

  • 3+ years in infrastructure operations or SRE in a data center or cloud environment
  • Strong understanding of GPU compute infrastructure and Linux
  • Experience building runbooks and incident management frameworks
  • Familiarity with monitoring tools like Prometheus or Grafana

✨ Nice to Have

  • Experience in GPU cloud or AI/ML infrastructure operations
  • Background in NOC and SOC functions
  • Scripting skills in Python or Bash

🎁 Benefits & Perks

  • 📈 Stock Options
  • 🏥 100% paid Medical, Dental, and Vision insurance
  • 💼 401(k)
  • 🏖️ Flexible PTO
  • 👶 Parental Leave