1d ago
Infrastructure Engineer (GPU & Compute)
United States
$180k-$200k / year
full-time · senior · Remote · ai-ml
About This Role
You'll own image management, deployment, and validation across GPU-enabled infrastructure for AI workloads. You'll lead system validation and diagnostics so that environments are production-ready from day one, and you'll collaborate with cross-functional teams to scale compute for the future of AI.
What You'll Do
- Evolve image management and deployment systems for GPU environments
- Lead GPU diagnostics and validation workflows
- Build automation tools in Python for provisioning
- Support hardware qualification for new platforms
Requirements
- 5+ years of infrastructure engineering experience
- Strong Linux systems administration in production
- Hands-on experience with GPU-enabled systems and NVIDIA DCGM
- Proficiency in Python for automation
Nice to Have
- Experience with high-performance interconnects (InfiniBand, NVLink)
- AI/ML or HPC workload experience
- Large-scale hardware validation frameworks
Benefits & Perks
- Competitive salary $180k-$200k
- Performance bonus and equity
- Medical, dental, vision coverage
- Generous PTO and paid parental leave
- Flexible remote work options
Hiring Process
Estimated timeline: 2-3 weeks · AI estimate
- 1. Recruiter screen · 30 min
- 2. Technical interview · 60 min
- 3. Hiring manager round · 45 min