Software Engineer, Workload Enablement at OpenAI

6h ago

San Francisco

$293k-$455k / year

Full-time · Senior · Hybrid · AI/ML

💼 About This Role

You'll join the Scaling team to enable production workloads and end-to-end testing on new AI platforms. You'll port inference and training workloads, build benchmarks, and deep-dive into performance bottlenecks for distributed training at scale. This role offers direct impact on the infrastructure behind cutting-edge AI models.

🎯 What You'll Do

Port and validate inference/training workloads on new platforms
Build benchmark suites capturing real end-to-end behavior
Deep-dive into performance bottlenecks in distributed training/inference
Create test harnesses running in CI/lab environments

📋 Requirements

BS in CS/EE or equivalent practical experience
5+ years in ML systems, performance engineering, distributed systems, or HPC
Strong hands-on experience with PyTorch and modern LLM stacks
Experience with RDMA and debugging/optimizing comms libraries (NCCL/RCCL)

✨ Nice to Have

Experience building workload-shaped benchmarks and stress tests
Familiarity with RDMA networking and transport tuning
Experience running workloads in Kubernetes

🎁 Benefits & Perks

💰 Competitive equity included in total compensation
🏖️ Flexible time off
🏥 Comprehensive health insurance
