Senior Site Reliability Engineer

New York, NY, United States of America
Full-time, Senior, Remote, Financial technology

Description

As a Senior SRE at Block, you'll improve the reliability of critical infrastructure by building distributed platforms and applying AI-driven tooling to observability and incident response. You'll lead incident command during high-severity events and drive reliability improvements across the company.

Requirements

  • 5+ years of software development experience
  • Experience running production on-call for high-availability systems
  • Strong incident management skills: structured triage, mitigation under pressure, blameless postmortems
  • Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation
  • Monitoring and observability expertise: building and tuning alerts for uptime, error rates, latency regression, and resource exhaustion

Responsibilities

  • Build and extend platforms to improve system reliability
  • Standardize reliability tools across multiple platforms and organizations
  • Triage, coordinate, and lead stabilization of sev 0–1 incidents
  • Design and implement safe deployment patterns (progressive delivery, automated rollback, guardrails)
  • Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis