1d ago

Infrastructure Engineer (Storage)

New York, New York, United States; Remote; San Francisco, California, United States; Seattle, Washington, United States

$180k-$200k / year

full-time · senior · Remote · ai-ml

💼 About This Role

You'll build and operate storage systems for large-scale AI/ML training and inference workloads. You'll help scale distributed storage across bare-metal infrastructure, ensuring high throughput and low latency. This role offers a chance to work with cutting-edge AI infrastructure at a well-funded company.

🎯 What You'll Do

  • Operate and scale distributed storage systems (VAST, Ceph).
  • Optimize storage performance for high-throughput AI workloads.
  • Build and maintain automation for provisioning and monitoring.
  • Troubleshoot complex storage and data path issues.

📋 Requirements

  • 5+ years of experience in infrastructure or systems engineering.
  • Hands-on experience operating distributed storage systems (e.g., VAST, Ceph).
  • Strong Linux systems experience in production.
  • Proficiency in Python for automation.

✨ Nice to Have

  • Experience with VAST storage systems in production.
  • Familiarity with AI/ML or HPC workloads.
  • Experience with RDMA or GPUDirect Storage.

🎁 Benefits & Perks

  • 🏥 Comprehensive medical, dental, and vision coverage
  • 💰 Retirement and financial wellness support
  • 🏢 Equity component
  • 🌴 Discretionary bonus
  • 🌍 Remote/hybrid flexibility

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter screen · 30 min
  2. Technical interview · 60 min
  3. Hiring manager interview · 45 min