1h ago

Senior Site Reliability Engineer - Observability

Bengaluru, India
Full-time, Senior, Hybrid, Identity and access management

Description

You will own and evolve Okta's observability ecosystem, architecting a scalable telemetry platform using Splunk and Grafana. You will automate infrastructure with Terraform and Go/Python/Ruby, and participate in incident response to drive systemic improvements.

Requirements

  • Deep hands-on Splunk administration and SPL expertise
  • Proven ability to build actionable Grafana dashboards
  • 3+ years SRE/DevOps experience with high-availability systems
  • Strong coding skills in Go, Python, or Ruby
  • Experience with OpenTelemetry, Prometheus, Linux, and Kubernetes

Responsibilities

  • Lead Splunk architecture optimization for performance and cost-efficiency
  • Architect and maintain sophisticated Grafana dashboards
  • Design and build scalable observability infrastructure using Terraform
  • Optimize telemetry data pipelines (Metrics, Logs, Traces)
  • Develop Splunk workflows to automate incident response