Member of Technical Staff - Site Reliability

Remote

$140k-$200k / year (est.)

full-time · Remote · ai-ml

💼 About This Role

You'll own the reliability, performance, and scalability of Runlayer's enterprise MCP platform infrastructure. Working closely with founders and a senior engineering team, you'll directly enable AI adoption at scale through end-to-end ownership of cloud infrastructure and incident response. This role sits at the center of how AI gets things done in enterprises.

🎯 What You'll Do

  • Own reliability and performance of cloud infrastructure across AWS and GCP
  • Manage and optimize Kubernetes clusters and container orchestration
  • Drive database reliability engineering including performance tuning and scaling
  • Build and maintain CI/CD pipelines for rapid, safe deployments
  • Run incident response and on-call rotations

📋 Requirements

  • Strong AWS experience, particularly ECS, Aurora, and CloudWatch
  • GCP experience as we expand cross-cloud
  • Kubernetes and container orchestration expertise
  • Database reliability engineering (DBRE) experience, including performance tuning
  • CI/CD pipeline ownership and incident response experience

✨ Nice to Have

  • Experience deploying and supporting on-prem or hybrid environments
  • Python backend familiarity (our platform is Python-based)
  • Experience at an early-stage or high-growth company

🎁 Benefits & Perks

  • 💰 Competitive salary and equity
  • 🏖️ 4 weeks paid vacation plus paid sick leave and parental leave
  • 📚 Professional development budget for conferences, courses, and certifications
  • 💻 Top-tier equipment: your choice of laptop and accessories
  • 🏥 Comprehensive health, dental, and vision coverage

📨 Hiring Process

[email protected]
