Tech Lead, AI Compute Infrastructure at HeyGen

Los Angeles, Palo Alto, San Francisco, Toronto, Singapore

Full-time, Senior, Artificial Intelligence

Description

You will build and scale the compute infrastructure powering HeyGen's state-of-the-art AI models, directly impacting model performance and video generation quality. You'll optimize GPU utilization across thousands of devices, develop scalable frameworks for compute jobs, and collaborate closely with AI researchers.

Requirements

Bachelor's degree in CS or related field, or equivalent experience
5+ years of industry experience in large-scale MLOps, AI infrastructure, or HPC
Experience with data frameworks such as Ray, Apache Spark, and LanceDB
Proficiency in Python and C++
Deep experience with Kubernetes and Ray

Responsibilities

Optimize GPU utilization across thousands of devices for inference, training, and data processing
Build scalable frameworks for managing heterogeneous compute jobs including data ingestion, training, and evaluation
Develop observability and tracing tools for compute clusters to diagnose performance bottlenecks
Collaborate with AI researchers to integrate acceleration techniques into production pipelines
Champion cloud and container technologies (Kubernetes, Ray) for elastic scaling of distributed systems
