1d ago

Senior Site Reliability Engineer

San Francisco, CA or Remote (USA)

$190k-$206k / year

full-time · senior · Remote · software

💼 About This Role

You'll design and operate highly scalable, fault-tolerant systems for a remote-first startup building audit software. Your work will ensure reliability and observability across production environments, directly impacting practitioner work-life balance.

🎯 What You'll Do

  • Design and operate highly scalable, fault-tolerant systems in distributed cloud environments.
  • Define and implement SLOs, SLIs, and error budgets to guide reliability decisions.
  • Build and improve observability systems (metrics, logs, tracing) for deep system visibility.
  • Automate operational processes to reduce manual toil and improve resilience.

📋 Requirements

  • 5+ years of experience in site reliability engineering or a related discipline
  • Strong experience operating distributed systems in cloud environments (AWS preferred)
  • Hands-on experience building and managing observability platforms (Datadog, Prometheus, Grafana, CloudWatch)
  • Proficiency with Infrastructure as Code tooling (Terraform or equivalent)

✨ Nice to Have

  • Experience implementing distributed tracing systems (OpenTelemetry or similar)
  • Experience with capacity planning and performance benchmarking at scale
  • Familiarity with database performance tuning and observability across high-traffic systems

🎁 Benefits & Perks

  • 💵 Competitive compensation packages with meaningful ownership
  • 🏖️ Flexible PTO
  • 💰 401k
  • 🧘 Wellness benefits (free therapy sessions)
  • 💻 Technology & Work from Home reimbursement

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. Hiring Manager Interview · 45 min