15h ago
Lead Cloud Infrastructure Engineer / Site Reliability Engineer
North America
$175k-$225k / year (est.)
full-time · lead · cybersecurity
About This Role
You'll ensure the stability, performance, and security of our Federal region's cloud platform. You'll manage infrastructure with a focus on availability and incident response while maintaining a FedRAMP-compliant environment.
What You'll Do
- Collaborate with engineering teams to ensure reliability and security of Federal infrastructure
- Design and scale AI/ML/LLM infrastructure on AWS, Azure, or GCP
- Manage and optimize Kubernetes environments for AI services and data pipelines
- Participate in 24x7 on-call rotations and lead incident response
Requirements
- 8+ years in SRE, DevOps, Platform Engineering, MLOps, or Cloud Infrastructure roles
- 4+ years production experience with Kubernetes (EKS, GKE, AKS) and Docker
- Strong programming skills in Python and proficiency in Bash, Go, or PowerShell
- Proficiency with Infrastructure-as-Code tools (Terraform, CloudFormation)
Nice to Have
- Cloud certifications (AWS, Azure, or GCP)
- Experience with agentic AI frameworks (CrewAI, LangGraph, AutoGen)
- Background in hybrid or on-prem AI deployments including OpenShift or Rancher
Benefits & Perks
- Unlimited PTO
- Health insurance
- Equity
- Remote work
- Learning budget
Hiring Process
Estimated timeline: 2-4 weeks (AI estimate)
1. Recruiter call · 30 min
2. Technical interview · 60 min
3. Hiring manager interview · 45 min