9h ago
Senior Site Reliability Engineer
Pune, India
$4800k-$8000k / year
full-time · senior · ai-ml
About This Role
You'll design and operate scalable, fault-tolerant infrastructure for an AI-powered software development platform at Blitzy, a fast-growing U.S. GenAI startup with a strong Pune office. Through hands-on contributions, you'll keep the platform highly available and performant, reducing MTTR and increasing system uptime. This is a high-impact role in which you'll shape the culture and technical standards of a new SRE team.
What You'll Do
- Design and build fault-tolerant cloud infrastructure on AWS/GCP/Azure
- Define SLOs and SLAs, and lead blameless postmortems
- Maintain CI/CD pipelines and deployment automation
- Own the observability stack, including Prometheus, Grafana, and Datadog
- Partner with engineering teams to embed reliability practices
Requirements
- 5+ years of SRE, DevOps, or Infrastructure Engineering experience
- Strong proficiency in AWS and Kubernetes at scale
- Hands-on experience with Terraform or Pulumi
- Deep expertise in observability tooling and incident management
Nice to Have
- Experience with AI/ML workloads or GPU-accelerated infrastructure
- Prior experience in a high-growth startup wearing multiple hats
- Familiarity with eBPF or service mesh technologies like Istio
Benefits & Perks
- Competitive equity eligibility based on performance
- Everyday athlete culture promoting sleep, movement, and mental performance
- Greenfield AI platform with direct influence on architectural decisions
- Founding member of the Pune SRE team, with growth opportunities
Hiring Process
Estimated timeline: 3-5 weeks · AI estimate
- 1. Recruiter Screen · 30 min
- 2. Technical Interview · 60 min
- 3. System Design Interview · 60 min
- 4. Hiring Manager Chat · 45 min
- 5. Offer · 1 week
This description was AI-summarized.