20h ago

Principal Site Reliability Engineer

Bengaluru

$150k-$200k / year (est.)

full-time, lead, Hybrid, cybersecurity

💼 About This Role

You'll own the long-term reliability strategy and architecture for Saviynt's AI-powered identity platform. You'll design planet-scale systems on AWS and Kubernetes and lead the development of autonomous operations powered by AI agents and LLM-driven SRE systems. This role combines deep technical expertise with cross-functional leadership to shape reliability culture across the company.

🎯 What You'll Do

  • Define long-term reliability strategy and architecture
  • Design highly resilient systems on AWS and Kubernetes (EKS)
  • Lead development of AI agent-driven autonomous operations platforms
  • Implement LLM-driven incident detection, triage, and self-healing systems

📋 Requirements

  • 10+ years in SRE / Platform / Distributed Systems Engineering
  • Deep expertise in AWS architecture at scale and Kubernetes internals
  • Strong programming skills in Python or Go for building platforms/tools
  • Experience leading cross-functional technical initiatives

✨ Nice to Have

  • Experience integrating LLMs into production systems (e.g., via OpenAI API)
  • Familiarity with agent frameworks like LangChain or AutoGen
  • Knowledge of RAG pipelines and vector databases