Staff Site Reliability Engineer

Remote
Full-time, Senior, Remote, Healthcare technology

Description

As a Staff Site Reliability Engineer at Blink Health, you will establish SRE best practices, drive observability strategy, and design software-driven solutions to automate infrastructure and reduce operational complexity. You will lead technical initiatives, mentor engineers, and partner with cross-functional teams to improve platform resilience and developer productivity.

Requirements

  • Bachelor's or Master's degree in Computer Science or equivalent practical experience
  • 7+ years of experience in SRE, infrastructure, or platform engineering
  • Expert-level troubleshooting across the full stack, from application to kernel to network
  • Deep expertise in Linux systems and networking concepts (TCP/IP, DNS, load balancing)
  • Proficiency in Python, Go, and Bash, with experience automating operational work

Responsibilities

  • Establish and evolve SRE best practices, including error budgets, incident response, and operational readiness
  • Define and drive observability strategy with SLIs/SLOs, alerting, and dashboards
  • Design and implement software-driven automation to eliminate operational toil
  • Act as a technical leader, driving large, ambiguous initiatives across infrastructure and platform architecture
  • Provide technical mentorship and architecture guidance to engineering teams