1d ago
Infrastructure Engineer (Storage)
New York, New York, United States; Remote; San Francisco, California, United States; Seattle, Washington, United States
$180k-$200k / year
full-time · senior · Remote · ai-ml
💼 About This Role
You'll build and operate storage systems for large-scale AI/ML training and inference workloads. You'll help scale distributed storage across bare-metal infrastructure, ensuring high throughput and low latency. This role offers a chance to work with cutting-edge AI infrastructure at a well-funded company.
🎯 What You'll Do
- Operate and scale distributed storage systems (VAST, Ceph).
- Optimize storage performance for high-throughput AI workloads.
- Build and maintain automation for provisioning and monitoring.
- Troubleshoot complex storage and data path issues.
📋 Requirements
- 5+ years of experience in infrastructure or systems engineering.
- Hands-on experience operating distributed storage systems (e.g., VAST, Ceph).
- Strong Linux systems experience in production.
- Proficiency in Python for automation.
✨ Nice to Have
- Experience with VAST storage systems in production.
- Familiarity with AI/ML or HPC workloads.
- Experience with RDMA or GPUDirect Storage.
🎁 Benefits & Perks
- 🏥 Comprehensive medical, dental, and vision coverage
- 💰 Retirement and financial wellness support
- 🏢 Equity component
- 🌴 Discretionary bonus
- 🌍 Remote/hybrid flexibility
📨 Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter screen · 30 min
- 2. Technical interview · 60 min
- 3. Hiring manager interview · 45 min