Machine Learning Infrastructure Engineer
United States
$180k-$300k / year
full-time · mid · finance
About This Role
You'll design high-performance infrastructure for large-scale generative AI and ML workloads at a leading hedge fund. Your work will directly impact model iteration speed and production reliability. This role offers hands-on experience with state-of-the-art GPU clusters and distributed systems.
What You'll Do
- Design and implement high-performance infrastructure for GenAI/ML workloads
- Operate distributed systems for model training, inference, and data pipelines
- Develop and automate CI/CD pipelines for models and data workflows
- Implement observability, monitoring, and cost-management for GPU environments
Requirements
- 3–7 years of experience building ML or compute infrastructure systems
- Deep understanding of distributed systems, Kubernetes, and public cloud (AWS/GCP/Azure)
- Hands-on experience with MLflow, Ray, Airflow, Kubeflow, or Terraform
- Proficiency in Python and systems programming (Go, C++, or Rust)
Nice to Have
- Understanding of reinforcement learning concepts
- Strong debugging and performance profiling skills across GPU/CPU stacks
- Experience with cost-optimization for GPU-based environments
Benefits & Perks
- Fully-paid health care benefits
- Generous parental and family leave policies
- Tuition assistance
- 401(k) savings program with employer match
- Mental and physical wellness programs
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
1. Recruiter Screen · 30 min
2. Technical Phone Interview · 60 min
3. Onsite Interviews (2-3 rounds) · 3 hours total