ML Infrastructure Engineer, Safeguards

San Francisco, CA

$320,000-$405,000 / year

Full-time, Senior, Artificial Intelligence

Description

You will design and build scalable ML infrastructure to power real-time and batch safety evaluations for AI systems, directly contributing to making AI more trustworthy. You will collaborate with research teams to productionize safety research and build monitoring tools for safety-critical applications.

Requirements

  • 5+ years building production ML infrastructure
  • Proficient in Python; experience with PyTorch, TensorFlow, or JAX
  • Hands-on experience with cloud platforms (AWS, GCP) and Kubernetes
  • Understanding of distributed systems and high-throughput/low-latency workloads
  • Experience with data engineering tools (Spark, Airflow, streaming systems)

Responsibilities

  • Design and build scalable ML infrastructure for real-time and batch classifier and safety evaluations
  • Build monitoring and observability tools for model performance and data quality
  • Collaborate with research teams to productionize safety research
  • Optimize inference latency and throughput for real-time safety evaluations
  • Implement automated testing, deployment, and rollback systems for ML models in production