Member of Technical Staff - Compute Platform
New York
✨ $200k-$350k / year (est.)
full-time · mid · ai-ml
🛠 Tech Stack
💼 About This Role
You'll join Reflection's Compute Platform team to keep our compute layer healthy and highly available across multiple neo-clouds. You'll work on K8s-based cluster management, multi-cloud scheduling, and performance debugging at scale. This role offers genuinely hard systems problems at the frontier of AI infrastructure.
🎯 What You'll Do
- Build and maintain tools for automatic remediation and topology-aware scheduling
- Design and iterate on the cluster management stack for multi-GPU fleets
- Implement comprehensive cluster-wide monitoring and performance benchmarking
- Prepare infrastructure for next-generation GPU deployments and larger clusters
📋 Requirements
- Systems-level engineering experience with cluster-wide behavior
- Strong coding ability with focus on systems or GPU infrastructure
- Deep knowledge of GPU hardware and GPU communication libraries such as NCCL
- Alignment with a K8s-first architecture
✨ Nice to Have
- Cloud storage expertise with high-performance data products like VAST
- Experience with petabyte-scale data replication
- Knowledge of multi-cloud scheduling
🎁 Benefits & Perks
- 💰 Top-tier compensation with salary and equity
- 🏥 Comprehensive medical, dental, and vision insurance
- 👶 Fully paid parental leave for all new parents
- 🍽️ Daily lunch and dinner provided
- 🏖️ Paid time off and relocation support