12h ago
Software Engineer - Site Reliability Engineering
London
$150k-$200k / year (est.)
full-time · senior · Hybrid · software
About This Role
You'll automate for insight and scale, building systems that make troubleshooting fast and safe across thousands of Neo4j instances. You'll treat operations as a software problem, replacing tribal knowledge with codified practices. This role focuses on embedding SRE principles at the heart of product development and automating reliability across a global DBaaS platform.
What You'll Do
- Build automation for troubleshooting and safe rollouts.
- Design and improve incident response tooling and processes.
- Help teams define and act on SLIs and SLOs.
- Shape the observability stack for early issue detection.
Requirements
- Proficiency in Go for backend tools and automation.
- Experience applying SRE practices like defining SLIs/SLOs.
- Expertise in troubleshooting large-scale cloud-based systems.
- Experience with Kubernetes deployment and management.
Nice to Have
- Cluster-level Kubernetes administration.
- Experience with Kustomize and Terraform.
- Familiarity with observability tools like Prometheus and Grafana.
Benefits & Perks
- Unlimited PTO
- Competitive salary and equity
- Health insurance
- Remote-friendly culture
- Professional development budget
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. Hiring Manager · 45 min