2d ago
Staff Site Reliability Engineer
Berlin
$180k-$220k / year (est.)
full-time · lead · Hybrid · retail
About This Role
You'll join the Operational Excellence team at GetYourGuide, a travel experiences platform, to prevent incidents and improve reliability. You'll drive observability, cost efficiency, and a culture of operational excellence across all product teams. This role offers the chance to work with AI-powered experiences and shape reliability practices for a global company.
What You'll Do
- Reduce incident frequency, MTTD, and MTTR
- Lead post-incident reviews and drive systemic improvements
- Build tooling and runbooks for production issue diagnosis
- Advance Datadog-based observability practice with SLOs and alerts
Requirements
- Deep understanding of observability tooling, especially Datadog
- Proven experience reducing MTTD, MTTR, and change failure rate
- Strong coding skills in Java; comfortable with Go and frontend context
- Experience with Kubernetes, AWS, and service mesh technologies (Istio/Envoy)
Nice to Have
- Led company-wide DORA metric improvements
- Driven automated testing improvements reducing production incidents
- Embedded operational excellence into product engineering teams
Benefits & Perks
- Annual personal growth budget and mentorship programs
- Work from anywhere for 40 days per year
- Health and wellness benefits including transportation and fitness budget
- Discounts on GetYourGuide activities for you and family
- Language reimbursement program
Hiring Process
Estimated timeline: 3-5 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Screen · 60 min
- 3. On-site Interview · 4 hours