Senior Site Reliability Engineer
Remote (Atlanta, Austin, San Francisco, Seattle)
$156k-$288k / year
Full-time, Senior, Remote, Software
💼 About This Role
You'll join a specialized team that ensures the reliability of Ditto's edge-to-cloud database technology for enterprise customers. Your core responsibilities are developing observability solutions, leading incident management, and designing automation that reduces operational overhead. You'll also collaborate with product engineering to architect resilient systems and maintain SLOs.
🎯 What You'll Do
- Develop and maintain observability solutions with Datadog, Prometheus, and Grafana
- Lead incident management and coordinate response efforts
- Partner with product teams to design reliable systems
- Implement SLOs, monitoring, and alerting strategies
- Design automation to improve system resilience
📋 Requirements
- 6+ years in Site Reliability Engineering or similar DevOps roles
- Strong experience with Prometheus, Grafana, and Datadog
- Proficiency in at least one systems language: Python, Go, Rust, C/C++, or Java
- Expertise with Infrastructure as Code tools (Terraform, Helm)
- Expertise with at least one major cloud provider: AWS, GCP, or Azure
✨ Nice to Have
- Experience building multi-tenant, multi-cloud SaaS/DBaaS platforms
- 4+ years architecting applications for cloud platforms
- Knowledge of edge computing or mesh networking
🎁 Benefits & Perks
- 💰 Competitive salary and meaningful equity
- 🏥 Health, dental, vision, life, and disability insurance
- 🏖️ Flexible time off
- 🏢 Access to Atlanta and San Francisco offices
- 💼 401(k) and flexible spending accounts