1d ago
Production Engineer/Site Reliability Engineer (Shift Basis)
Bangalore
full-time, mid-level, cybersecurity
Description
Join a 24/7 Production Operations team to manage critical infrastructure across multi-cloud environments, lead incident response, and drive automation to improve system reliability and uptime.
Requirements
- Solid understanding of distributed system concepts
- Experience running production systems on public cloud infrastructure
- Familiarity with Kubernetes and container orchestration
- Hands-on experience with Terraform and CloudFormation
- Proficiency in Python, UNIX, networking, and databases such as MySQL
Responsibilities
- Manage and support critical infrastructure in multi-cloud environments
- Implement observability solutions for monitoring, alerting, and metrics
- Lead incident management and coordinate resolution across teams
- Analyze recurring incidents to identify root causes and reduce toil
- Design automation tools to detect, triage, and remediate production issues