Senior Site Reliability Engineer - Observability
Bengaluru, India
Full-time, Senior, Hybrid, Identity and access management
Description
You will own and evolve Okta's observability ecosystem, architecting a scalable telemetry platform using Splunk and Grafana. You will automate infrastructure with Terraform and Go, Python, or Ruby, and participate in incident response to drive systemic improvements.
Requirements
- Deep hands-on Splunk administration and SPL expertise
- Proven ability to build actionable Grafana dashboards
- 3+ years SRE/DevOps experience with high-availability systems
- Strong coding skills in Go, Python, or Ruby
- Experience with OpenTelemetry, Prometheus, Linux, and Kubernetes
Responsibilities
- Lead Splunk architecture optimization for performance and cost-efficiency
- Architect and maintain sophisticated Grafana dashboards
- Design and build scalable observability infrastructure using Terraform
- Optimize telemetry data pipelines (Metrics, Logs, Traces)
- Develop Splunk workflows to automate incident response