Principal Machine Learning Engineer - Reliability

San Mateo, CA, United States
full-time · senior · Online gaming / social platform

Description

As a Principal Machine Learning Engineer on the Reliability team, you will define the technical strategy for using ML to improve platform reliability, reduce incident detection and resolution times, and ship production-grade solutions.

Requirements

  • 8+ years designing large-scale ML systems in production
  • Proven track record of setting long-term technical direction for an ML domain
  • Deep expertise in Computer Vision and/or Vision-Language Models
  • Experience architecting scalable real-time ML inference and data pipelines
  • Strong product sense and strategic planning ability

Responsibilities

  • Define and own the multi-year technical vision for ML in reliability
  • Collaborate with executive stakeholders to prioritize the ML roadmap
  • Lead adoption of innovative ML techniques (transfer learning, self-supervised learning, etc.)
  • Build end-to-end ML products from data pipelines to deployed solutions
  • Spend 30-40% of time on backend and integration work