
Software Engineer, Inference – AMD GPU Enablement

San Francisco

$295K–$555K / year

full-time · senior · ai-ml

🛠 Tech Stack

HIP · CUDA · Triton · vLLM · NCCL/RCCL

💼 About This Role

You'll scale and optimize OpenAI's inference infrastructure across emerging GPU platforms. Your core impact will be ensuring large models run efficiently on AMD hardware through low-level kernel optimization and distributed execution. This high-impact role lets you shape multi-platform inference capabilities from the ground up.

🎯 What You'll Do

  • Own bring-up, correctness, and performance of the inference stack on AMD hardware.
  • Integrate model-serving infrastructure (e.g., vLLM, Triton) into GPU-backed systems.
  • Debug and optimize distributed inference workloads across memory, network, and compute.
  • Design and optimize high-performance GPU kernels using HIP, Triton, or CUDA.

📋 Requirements

  • GPU kernel experience with HIP, CUDA, or Triton.
  • Familiarity with communication libraries like NCCL/RCCL.
  • Experience with distributed inference systems and scaling across GPUs.
  • Proven ability to solve end-to-end performance challenges across hardware and software layers.

✨ Nice to Have

  • Contributions to open-source libraries like RCCL, Triton, or vLLM.
  • Experience with GPU performance tools (Nsight, rocprof, perf).
  • Experience deploying inference on non-NVIDIA GPU environments.

🎁 Benefits & Perks

  • 💰 Competitive salary ($295K–$555K plus equity)
  • 🏖️ Flexible PTO
  • 💻 Remote-friendly (optional)
  • 📈 Equity grants
  • 🏥 Health insurance