7h ago

Senior Site Reliability Engineer

Arlington, VA

$180k-$220k / year

full-time · senior · software

💼 About This Role

You'll own the reliability, scalability, and security of production applications for a military staff collaboration platform. You'll implement observability platforms and lead incident response to keep service quality high. This role requires an active Top Secret clearance and on-site work in Arlington, VA.

🎯 What You'll Do

  • Design and manage the monitoring, logging, and alerting stack (Prometheus, Loki, Grafana).
  • Define and uphold SLIs and SLOs for system reliability.
  • Lead incident response and blameless post-mortems.
  • Automate infrastructure with Terraform, Ansible, and Kubernetes.

📋 Requirements

  • 5+ years in Platform, DevOps, or SRE roles with an infrastructure focus.
  • Active Top Secret clearance with SCI eligibility.
  • Proficiency in Kubernetes design, deployment, and operations.
  • Experience with Terraform and Ansible.

✨ Nice to Have

  • Experience in DoD environments and compliance frameworks (RMF, STIGs).
  • Familiarity with GitOps practices and toolchains.
  • Experience designing SLIs/SLOs with error budgets.

🎁 Benefits & Perks

  • 💰 Competitive salary $180K–$220K plus equity.
  • 🏠 Remote-friendly with relocation assistance provided.
  • 📈 Growth opportunities at a startup valued at $2.15B.
  • 🛡️ Work on mission-critical defense software.
  • 🤝 Collaborative culture with blameless post-mortems.