Senior Site Reliability Engineer - Observability at Careers — CareerPair

1h ago

Senior Site Reliability Engineer - Observability

Bellevue, Washington

$147,000-$202,000 / year

Full-time, Senior, Hybrid, Identity and Access Management / Security

Description

You will own and evolve our Splunk ecosystem, moving beyond simple monitoring to deliver a world-class Observability Platform. You'll treat infrastructure as code using Terraform and automate agent deployments across complex distributed systems.

Requirements

5+ years of scaling and managing Splunk Cloud (1000+ SVCs), including WLM and HEC optimization
Expertise in creating actionable Splunk dashboards correlating data from multiple sources
3+ years of SRE, DevOps, or Systems Engineering experience focused on high-availability systems
Strong coding skills in SPL and Go for building internal tools and automation
Deep understanding of Linux internals, networking, and Kubernetes/EKS

Responsibilities

Design, build, and maintain scalable observability infrastructure using Terraform
Optimize collection, processing, and storage of log data in Splunk for high reliability and low latency
Participate in on-call rotations and lead post-incident reviews
Eliminate toil by automating deployment and scaling of observability agents and collectors

Careers

Help us build the next generation of corporate IT by bringing your talent and motivation to Okta, the leader in identity and access management.
