ML Infrastructure Engineer, Safeguards
San Francisco, CA
$320,000-$405,000 / year
Full-time, Senior, Artificial Intelligence
Description
You will design and build scalable ML infrastructure to power real-time and batch safety evaluations for AI systems, directly contributing to making AI more trustworthy. You will collaborate with research teams to productionize safety research and build monitoring tools for safety-critical applications.
Requirements
- 5+ years building production ML infrastructure
- Proficient in Python; experience with PyTorch, TensorFlow, or JAX
- Hands-on experience with cloud platforms (AWS, GCP) and Kubernetes
- Understanding of distributed systems and high-throughput/low-latency workloads
- Experience with data engineering tools (Spark, Airflow, streaming systems)
Responsibilities
- Design and build scalable ML infrastructure for real-time and batch classifier and safety evaluations
- Build monitoring and observability tools for model performance and data quality
- Collaborate with research teams to productionize safety research
- Optimize inference latency and throughput for real-time safety evaluations
- Implement automated testing, deployment, and rollback systems for ML models in production