4h ago
Member of Technical Staff - Site Reliability
Remote
✨ $140k-$200k / year (est.)
full-time · Remote · ai-ml
💼 About This Role
You'll own the reliability, performance, and scalability of Runlayer's enterprise MCP platform infrastructure. Working closely with founders and a senior engineering team, you'll directly enable AI adoption at scale through end-to-end ownership of cloud infrastructure and incident response. This role sits at the center of how AI gets things done in enterprises.
🎯 What You'll Do
- Own reliability and performance of cloud infrastructure across AWS and GCP
- Manage and optimize Kubernetes clusters and container orchestration
- Drive database reliability engineering including performance tuning and scaling
- Build and maintain CI/CD pipelines for rapid, safe deployments
- Run incident response and on-call rotations
📋 Requirements
- Strong AWS experience, particularly ECS, Aurora, and CloudWatch
- GCP experience as we expand cross-cloud
- Kubernetes and container orchestration expertise
- Database reliability engineering (DBRE) experience, including performance tuning
- CI/CD pipeline ownership and incident response experience
✨ Nice to Have
- Experience deploying and supporting on-prem or hybrid environments
- Python backend familiarity (our platform is Python-based)
- Experience at an early-stage or high-growth company
🎁 Benefits & Perks
- 💰 Competitive salary and equity
- 🏖️ 4 weeks of paid vacation, plus paid sick leave and parental leave
- 📚 Professional development budget for conferences, courses, and certifications
- 💻 Top-tier equipment: your choice of laptop and accessories
- 🏥 Comprehensive health, dental, and vision coverage