Staff Technical Program Manager - Reliability Engineering
Menlo Park, CA
full-time, senior, fintech
Description
Lead programs that improve how Robinhood detects, responds to, and learns from system incidents. Define incident management standards, guide response coordination, and ensure follow-up actions are completed. Translate reliability strategy into clear execution plans, aligning engineers and leaders on priorities and measurable outcomes.
Requirements
- 7+ years of experience leading technical programs in infrastructure, reliability, or incident management at scale
- Ability to understand system architecture and work closely with senior engineers
- Experience building incident management or reliability processes that improved uptime or response time
- Strong communication skills for high-pressure coordination
- Proven ability to organize complex work across multiple engineering teams
Responsibilities
- Lead incident management programs including response processes, escalation paths, and post-incident tracking
- Define and track follow-up actions after incidents to reduce repeat issues
- Run a technical risk assessment program to identify and control top risks
- Manage infrastructure migration projects to build pre-production testing discipline
- Provide updates on program status, risks, and system health metrics to leadership