6h ago
Site Reliability Engineer
Bengaluru, India
$35k-$50k / year (est.)
full-time · mid · Hybrid · finance
About This Role
You'll design resilient systems and define SLOs that reflect customer experience. You'll use Datadog and CloudWatch to build high-signal observability and improve the incident lifecycle, combining software fundamentals with reliability thinking to keep systems highly available and easy to debug.
What You'll Do
- Design systems with resilience, graceful degradation, and capacity planning.
- Define and measure SLOs and SLIs that reflect customer experience.
- Build observability using Datadog and CloudWatch for alerting and monitoring.
- Continuously improve the incident lifecycle from detection to blameless follow-ups.
Requirements
- 3+ years of experience in an SRE or Software Engineering role.
- Hands-on coding experience in two programming languages.
- Experience managing production environments with observability tools.
- Experience using SLOs and SLIs to guide decisions and prioritize work.
Nice to Have
- Experience with AI-assisted development tools like GitHub Copilot or Cursor.
- Built or contributed to agentic AI workflows for runbook automation or alert triage.
- Familiarity with incident.io or similar incident management platforms.
Benefits & Perks
- Healthcare coverage
- Internet/cell phone reimbursement
- Learning and development stipend
- Opportunities to travel to the Palo Alto HQ and the Bangkok site
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Interview · 60 min
- 3. On-site Interview · half day