
Senior Site Reliability Engineer

Bay Area, CA, United States of America
Full-time · Senior · Remote · Financial technology

Description

As a member of the SRE team at Block, you'll proactively and reactively improve the reliability of the platform and critical infrastructure. You'll leverage AI-driven tooling to enhance observability, accelerate incident detection and response, and reduce operational toil. This includes participating in the on-call rotation and leading incident command for high-severity events.

Requirements

  • 5+ years of software development experience
  • Drive to find the root cause of problems in systems with many moving parts
  • Experience running production on-call for high-availability systems
  • Strong incident management skills
  • Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation

Responsibilities

  • Build and extend platforms to improve system reliability
  • Standardize reliability tools across multiple platforms and organizations
  • Triage, coordinate, and lead stabilization of sev 0–1 incidents
  • Serve as primary on-call, maintaining structured escalation paths
  • Use AI-driven systems to improve signal detection and accelerate root cause analysis