21h ago
Site Reliability Engineer
London, UK
$160k-$220k / year (est.)
full-time · senior · Hybrid · ai-ml
About This Role
You'll ensure Writer's AI platform is available, performant, and reliable 24/7. You'll build resilient systems and automate across the stack, directly enabling enterprise customers. This hybrid role is based in NYC or London.
What You'll Do
- Automate operational tasks and infrastructure management using Python or Go
- Design scalable, fault-tolerant infrastructure on AWS, GCP, or Azure
- Own reliability and performance of core services with SLOs and error budgets
- Lead incident response, post-mortems, and root cause analyses
Requirements
- 7+ years in site reliability engineering or a similar role
- Deep expertise with cloud platforms (AWS preferred) and Docker/Kubernetes
- Proficiency in Python, Java, or Go for automation
- Knowledge of monitoring tools such as Prometheus, Grafana, or the ELK Stack
Nice to Have
- Experience with large-scale, high-traffic platforms
- Background in enterprise SaaS or AI infrastructure
Benefits & Perks
- Generous PTO plus company holidays
- Comprehensive medical and dental insurance
- Paid parental leave (16 weeks for all parents)
- Annual learning stipend and wellness stipend
- Competitive compensation and stock options
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Screen · 60 min
- 3. Onsite Interview · 4 hours