Senior Director, Platform Engineering & Reliability

Remote - California; Remote - Colorado; Remote - Illinois; Remote - Massachusetts; Remote - North Carolina; Remote - Texas; Remote - Washington
Full-time, Director, Remote, automotive repair shop management software

Description

You will own the reliability, scalability, security, and operational excellence of Shopmonkey's platform, in a role that sits at the intersection of engineering and infrastructure. You'll be deeply embedded in both the technical and organizational sides of the company, leading the teams that build and maintain developer tooling, CI/CD pipelines, and cloud infrastructure on GCP. You will also own multi-year roadmaps, incident management, and compliance programs, partnering closely with Product, Engineering, and Security to translate business priorities into operational reality.

Requirements

  • 10+ years in infrastructure, platform engineering, SRE, or DevOps with 4-5 years in senior leadership
  • Proven experience owning production reliability at a SaaS company with significant uptime requirements
  • Track record of building DevOps/platform engineering organizations from the ground up or through transformation
  • Experience leading cloud-native infrastructure on GCP, AWS, or Azure with Kubernetes and container orchestration
  • Prior ownership of security and compliance programs (SOC 2, ISO 27001, or similar)

Responsibilities

  • Define and enforce SLOs, SLAs, and error budgets; lead incident management end-to-end
  • Drive observability maturity using OpenTelemetry, Prometheus, and dashboards
  • Own GCP-based cloud infrastructure: provisioning, scaling, cost optimization
  • Lead compliance programs for SOC 2 Type II and beyond; build security into the development lifecycle
  • Build, retain, and grow a high-performing team spanning DevOps, SRE, Platform Engineering, and Infrastructure