Senior Machine Learning Engineer - Training Platform
Australia
$250k-$400k / year (est.)
full-time · senior · Remote · ai-ml
About This Role
You'll join a high-impact AI Platform group building foundational systems for large-scale model training. You'll design and evolve infrastructure that enables distributed AI training workloads to run reliably and efficiently at scale. Your work will directly support research scientists and ML engineers in deploying advanced AI capabilities.
What You'll Do
- Design and scale core training platform infrastructure for distributed AI workloads
- Improve reliability, observability, and debugging of large-scale training systems
- Enhance scheduling, resource allocation, and quota management for AI training jobs
- Collaborate with ML engineers and researchers to optimize training workflows
Requirements
- Strong experience in machine learning infrastructure or distributed systems
- Hands-on expertise with Kubernetes and containerized environments
- Familiarity with distributed training frameworks such as Ray or PyTorch Distributed
- Experience working with cloud infrastructure for high-performance workloads
Nice to Have
- Experience with Ray or PyTorch distributed training
- Background in high-performance computing (HPC) environments
Benefits & Perks
- Equity packages to share in long-term success
- Inclusive parental leave supporting all parents
- Annual wellbeing allowance for personal and professional needs
- Flexible leave options for rest and recharge
- Remote-friendly working model within Australia
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
1. Recruiter Screen · 30 min
2. Technical Interview · 60 min
3. System Design Interview · 60 min