ML Systems Engineer at Periodic Labs

2d ago

ML Systems Engineer

Menlo Park

$300k-$400k / year

Full-time · Lead · AI/ML · Visa Sponsor

💼 About This Role

You'll build the systems layer that makes frontier model training and inference fast and tightly coupled to the RL feedback loop for scientific discovery. You'll go deep into scheduling, kernels, RDMA, and weight synchronization while working with researchers to co-design algorithms and infrastructure. The speed of the RL loop directly multiplies the pace of discovery.

🎯 What You'll Do

Build rack- and topology-aware GPU scheduling across Ray, Slurm, and Kubernetes
Implement direct S3 checkpoint streaming to eliminate I/O bottlenecks
Write and optimize communication and GPU kernels for maximum throughput
Design zero-copy RDMA weight synchronization between training and inference

📋 Requirements

Experience running large-scale inference infrastructure in production
Low-level systems programming with RDMA, NVLink, and kernel-level work
GPU cluster scheduling across Ray, Slurm, or Kubernetes
Writing and optimizing CUDA kernels for distributed training

✨ Nice to Have

Contributions to open-source ML infrastructure projects such as SGLang, Megatron-LM, or vLLM
Experience working directly with ML researchers on algorithm-infrastructure co-design

🎁 Benefits & Perks

💰 Competitive compensation: $300k-$400k range
🏥 Health benefits (implied by startup environment)
🗽 Visa sponsorship available
🚀 Work at a cutting-edge AI company backed by top investors

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

1. Recruiter Phone Screen · 30 min
2. Technical Screen · 45 min
3. On-site Interviews · 4-5 hours
