9h ago
Member of Technical Staff, Generalist
Remote
✨ $150k-$250k / year (est.)
full-time, Remote, ai-ml, Visa Sponsor
💼 About This Role
You'll work across the entire vLLM stack, from low-level GPU kernels to high-level distributed systems. Your work will directly impact how the world runs AI inference by optimizing serving performance at global scale.
🎯 What You'll Do
- Optimize CUDA kernels and GPU memory management
- Design distributed orchestration for inference at scale
- Implement new model architectures in vLLM
- Build cloud automation and monitoring infrastructure
📋 Requirements
- Bachelor's degree in CS or equivalent experience
- Deep expertise in systems, GPU, distributed systems, or ML infra
- Strong track record of shipping high-impact work in complex environments
- Proficiency in at least two of: CUDA, Rust/Go/C++, Python/PyTorch, Kubernetes (K8s)
✨ Nice to Have
- Contributions to vLLM or major open-source ML projects
- Experience with multiple accelerator platforms (NVIDIA, AMD, TPU, Intel)
- Knowledge of quantization or compiler technologies
🎁 Benefits & Perks
- 💵 Competitive salary + equity
- 🏥 Health coverage (location-dependent)
- 🌍 Fully remote work with global flexibility
- 🛂 Visa sponsorship considered case by case