Member of Technical Staff, Inference
Bay Area
✨ $275k–$350k / year (est.)
Full-time · Senior · Remote
🛠 Tech Stack
Python · C++ · Rust · Go · CUDA · Triton
💼 About This Role
You'll build low-latency inference pipelines for on-device deployment, enabling real-time control loops in robotics. You'll design distributed inference systems on GPU clusters and maximize throughput through efficient resource utilization. The role spans low-level CUDA and Triton kernel work as well as high-level framework integration.
🎯 What You'll Do
- Build low-latency inference pipelines for on-device deployment
- Design and optimize distributed inference systems on GPU clusters
- Implement efficient low-level code (CUDA, Triton, custom kernels)
- Develop monitoring and debugging tools for reliability and determinism
📋 Requirements
- 8+ years of experience in distributed systems or ML infrastructure
- Production-grade expertise in Python and systems languages (C++/Rust/Go)
- Low-level performance mastery in CUDA and Triton
- Proven track record of scaling inference workloads in both cluster and on-device environments
✨ Nice to Have
- Experience with quantization and memory scheduling
- Knowledge of graph compilation techniques
- Background in robotics or real-time systems
🎁 Benefits & Perks
- 💰 Competitive Equity
- 🏖️ Unlimited PTO
- 🏥 Health Insurance