1d ago
Senior Site Reliability Engineer
San Francisco, CA or Remote (USA)
$190k-$206k / year
full-time · senior · Remote · software
About This Role
You'll design and operate highly scalable, fault-tolerant systems for a remote-first startup building audit software. Your work will ensure reliability and observability across production environments, directly impacting practitioner work-life balance.
What You'll Do
- Design and operate highly scalable, fault-tolerant systems in distributed cloud environments.
- Define and implement SLOs, SLIs, and error budgets to guide reliability decisions.
- Build and improve observability systems (metrics, logs, tracing) for deep system visibility.
- Automate operational processes to reduce manual toil and improve resilience.
Requirements
- 5+ years of experience in site reliability engineering or a related discipline
- Strong experience operating distributed systems in cloud environments (AWS preferred)
- Hands-on experience building and managing observability platforms (Datadog, Prometheus, Grafana, CloudWatch)
- Proficiency with Infrastructure as Code tooling (Terraform or equivalent)
Nice to Have
- Experience implementing distributed tracing systems (OpenTelemetry or similar)
- Experience with capacity planning and performance benchmarking at scale
- Familiarity with database performance tuning and observability across high-traffic systems
Benefits & Perks
- Competitive compensation packages with meaningful ownership
- Flexible PTO
- 401k
- Wellness benefits (free therapy sessions)
- Technology & Work from Home reimbursement
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. Hiring Manager Interview · 45 min