Staff Technical Program Manager - Reliability Engineering

Menlo Park, CA
full-time, senior, fintech

Description

Lead programs that improve how Robinhood detects, responds to, and learns from system incidents. Define incident management standards, guide response coordination, and ensure follow-up actions are completed. Translate reliability strategy into clear execution plans, aligning engineers and leaders on priorities and measurable outcomes.

Requirements

  • 7+ years of experience leading technical programs in infrastructure, reliability, or incident management at scale
  • Ability to understand system architecture and work closely with senior engineers
  • Experience building incident management or reliability processes that improved uptime or response time
  • Strong communication skills for high-pressure coordination
  • Proven ability to organize complex work across multiple engineering teams

Responsibilities

  • Lead incident management programs including response processes, escalation paths, and post-incident tracking
  • Define and track follow-up actions after incidents to reduce repeat issues
  • Run a technical risk assessment program to identify and control top risks
  • Manage infrastructure migration projects that build pre-production testing discipline
  • Provide updates on program status, risks, and system health metrics to leadership