12h ago

Machine Learning Infrastructure Engineer

United States

$180k-$300k / year

full-time · mid · finance

💼 About This Role

You'll design high-performance infrastructure for large-scale generative AI and ML workloads at a leading hedge fund. Your work will directly impact model iteration speed and production reliability. This role offers hands-on experience with state-of-the-art GPU clusters and distributed systems.

🎯 What You'll Do

  • Design and implement high-performance infrastructure for GenAI/ML workloads
  • Operate distributed systems for model training, inference, and data pipelines
  • Develop and automate CI/CD pipelines for models and data workflows
  • Implement observability, monitoring, and cost management for GPU environments

📋 Requirements

  • 3–7 years of experience building ML or compute infrastructure systems
  • Deep understanding of distributed systems, Kubernetes, and public cloud (AWS/GCP/Azure)
  • Hands-on experience with MLflow, Ray, Airflow, Kubeflow, or Terraform
  • Proficiency in Python and systems programming (Go, C++, or Rust)

✨ Nice to Have

  • Understanding of reinforcement learning concepts
  • Strong debugging and performance profiling skills across GPU/CPU stacks
  • Experience with cost optimization for GPU-based environments

🎁 Benefits & Perks

  • 🏖️ Fully-paid health care benefits
  • 👶 Generous parental and family leave policies
  • 🎓 Tuition assistance
  • 💰 401(k) savings program with employer match
  • 🧘 Mental and physical wellness programs

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Phone Interview · 60 min
  3. Onsite Interviews (2-3 rounds) · 3 hours total