12h ago

Software Engineer - Site Reliability Engineering

London

✨ $150k-$200k / year (est.)

full-time · senior · Hybrid · software

🛠 Tech Stack

💼 About This Role

You'll automate for insight and scale, building systems that make troubleshooting fast and safe across thousands of Neo4j instances. You'll treat operations as a software problem, replacing tribal knowledge with codified practices. This role focuses on embedding SRE principles at the heart of product development and automating reliability across a global DBaaS platform.

🎯 What You'll Do

  • Build automation for troubleshooting and safe rollouts.
  • Design and improve incident response tooling and processes.
  • Help teams define and act on SLIs and SLOs.
  • Shape the observability stack for early issue detection.

📋 Requirements

  • Proficiency in Go for backend tools and automation.
  • Experience applying SRE practices like defining SLIs/SLOs.
  • Expertise in troubleshooting large-scale cloud-based systems.
  • Experience with Kubernetes deployment and management.

✨ Nice to Have

  • Cluster-level Kubernetes administration.
  • Experience with Kustomize and Terraform.
  • Familiarity with observability tools like Prometheus and Grafana.

🎁 Benefits & Perks

  • 🏖️ Unlimited PTO
  • 💰 Competitive salary and equity
  • 🏥 Health insurance
  • 🏠 Remote-friendly culture
  • 📈 Professional development budget

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. Hiring Manager · 45 min