15h ago

Lead Cloud Infrastructure Engineer / Site Reliability Engineer

North America

✨ $175k-$225k / year (est.)

full-time · lead · cybersecurity

💼 About This Role

You'll ensure the stability, performance, and security of our Federal region's cloud platform. You'll manage infrastructure with a focus on availability and incident response while maintaining a FedRAMP-compliant environment.

🎯 What You'll Do

  • Collaborate with engineering teams to ensure reliability and security of Federal infrastructure
  • Design and scale AI/ML/LLM infrastructure across AWS, Azure, or GCP
  • Manage and optimize Kubernetes environments for AI services and data pipelines
  • Participate in 24x7 on-call rotations and lead incident response

📋 Requirements

  • 8+ years in SRE, DevOps, Platform Engineering, MLOps, or Cloud Infrastructure roles
  • 4+ years production experience with Kubernetes (EKS, GKE, AKS) and Docker
  • Strong programming skills in Python and proficiency in Bash, Go, or PowerShell
  • Proficiency with Infrastructure-as-Code tools (Terraform, CloudFormation)

✨ Nice to Have

  • Cloud certifications (AWS, Azure, or GCP)
  • Experience with agentic AI frameworks (CrewAI, LangGraph, AutoGen)
  • Background in hybrid or on-prem AI deployments including OpenShift or Rancher

🎁 Benefits & Perks

  • 🏖️ Unlimited PTO
  • 🏥 Health insurance
  • 💰 Equity
  • 🏠 Remote work
  • 📚 Learning budget

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter call · 30 min
  2. Technical interview · 60 min
  3. Hiring manager interview · 45 min