Senior / Staff AI Research Engineer, Real-Time Inference

Milpitas, CA
Full-time · Senior · Robotics

Description

In this role, you will drive the full stack of model optimization—from CUDA kernel engineering to quantization and compression—to deploy high-performance AI models on edge compute platforms powering RoboForce robots in the field.

Requirements

  • Master's degree in CS, EE, or a related field with 4+ years of experience, or a PhD.
  • Deep expertise in CUDA, GPU architecture, and low-level kernel optimization.
  • Hands-on experience with TensorRT, ONNX Runtime, TVM, or Triton for model quantization and deployment.
  • Proficiency in C++ and Python with strong systems programming skills.
  • Experience deploying ML models on edge/embedded hardware (e.g., NVIDIA Jetson, Orin).

Responsibilities

  • Develop and optimize real-time inference pipelines for embodied AI models on edge hardware (e.g., NVIDIA Jetson).
  • Implement CUDA-level custom kernels, memory layout tuning, and hardware-aware graph compilation to minimize latency.
  • Apply model compression techniques including quantization, pruning, distillation, and structured sparsity.
  • Profile and debug inference stacks using tools like Nsight, TensorRT, and Triton to eliminate performance bottlenecks.
  • Collaborate with ML research and robotics teams to co-design architectures meeting real-time control-loop requirements.
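To illustrate the kind of model-compression work described above, here is a minimal sketch of symmetric per-tensor int8 post-training quantization in plain Python. The function names and the single-scale scheme are illustrative assumptions, not RoboForce's actual pipeline; in practice this role would use TensorRT's calibration and quantization tooling rather than hand-rolled code:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    # One scale factor shared across the whole tensor (per-tensor scheme).
    scale = max(abs(w) for w in weights) / 127.0
    # Round to the nearest integer and clip into the signed int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights; per-element error is at most scale/2."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The per-tensor scale keeps the scheme simple; per-channel scales (one per output channel) typically recover more accuracy and are what deployment toolchains use in practice.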