Senior Site Reliability Engineer

Remote (Atlanta, Austin, San Francisco, Seattle)

$156k-$288k / year

full-time, senior, remote, software

💼 About This Role

You'll join a specialized team that ensures the reliability of Ditto's edge-to-cloud database technology for enterprise customers. Your core work involves developing observability solutions, leading incident management, and designing automation to reduce operational overhead. You'll also collaborate with product engineering to architect resilient systems and maintain SLOs.

🎯 What You'll Do

  • Develop and maintain observability solutions with Datadog, Prometheus, and Grafana
  • Lead incident management and coordinate response efforts
  • Partner with product teams to design reliable systems
  • Implement SLOs, monitoring, and alerting strategies
  • Design automation to improve system resilience

📋 Requirements

  • 6+ years in Site Reliability Engineering or similar DevOps roles
  • Strong experience with Prometheus, Grafana, and Datadog
  • Proficiency in at least one systems language: Python, Go, Rust, C/C++, or Java
  • Expertise with Infrastructure as Code tools (Terraform, Helm)
  • Expertise with at least one major cloud provider: AWS, GCP, or Azure

✨ Nice to Have

  • Experience building multi-tenant, multi-cloud SaaS/DBaaS platforms
  • 4+ years architecting applications for cloud platforms
  • Knowledge of edge computing or mesh networking

🎁 Benefits & Perks

  • 💰 Competitive salary and meaningful equity
  • 🏥 Health, dental, vision, life, and disability insurance
  • 🏖️ Flexible time off
  • 🏢 Access to Atlanta and San Francisco offices
  • 💼 401(k) and flexible spending accounts