Senior Site Reliability Engineer
Arlington, VA
$180k-$220k / year
full-time · senior · software
💼 About This Role
You'll own the reliability, scalability, and security of production applications for a military staff collaboration platform. You will implement observability platforms and lead incident response to ensure best-in-class service quality. This role requires an active Top Secret clearance and on-site work in Arlington, VA.
🎯 What You'll Do
- Design and manage the monitoring, logging, and alerting stack (Prometheus, Loki, Grafana).
- Define and uphold SLIs and SLOs for system reliability.
- Lead incident response and blameless post-mortems.
- Automate infrastructure with Terraform, Ansible, and Kubernetes.
📋 Requirements
- 5+ years in Platform, DevOps, or SRE with infrastructure focus.
- Active Top Secret clearance with SCI eligibility.
- Proficiency in Kubernetes design, deployment, and operations.
- Experience with Terraform and Ansible.
✨ Nice to Have
- Experience in DoD environments and compliance frameworks (RMF, STIGs).
- Familiarity with GitOps practices and toolchains.
- Experience designing SLIs/SLOs with error budgets.
🎁 Benefits & Perks
- 💰 Competitive salary $180K–$220K plus equity.
- 🏠 Remote-friendly with relocation assistance provided.
- 📈 Growth opportunities at a startup valued at $2.15B.
- 🛡️ Work on mission-critical defense software.
- 🤝 Collaborative culture with blameless post-mortems.