Principal Machine Learning Engineer - Reliability
San Mateo, CA, United States
full-time | senior | Online gaming / social platform
Description
As a Principal Machine Learning Engineer on the Reliability team, you will define the technical strategy for using ML to improve platform reliability, reduce incident detection and resolution times, and ship production-grade solutions.
Requirements
- 8+ years designing large-scale ML systems in production
- Proven track record of setting long-term technical direction for an ML domain
- Deep expertise in Computer Vision and/or Vision-Language Models
- Experience architecting scalable real-time ML inference and data pipelines
- Strong product sense and strategic planning ability
Responsibilities
- Define and own the multi-year technical vision for ML in reliability
- Collaborate with executive stakeholders to prioritize the ML roadmap
- Lead adoption of innovative ML techniques (transfer learning, self-supervised learning, etc.)
- Build end-to-end ML products from data pipelines to deployed solutions
- Spend 30-40% of time on backend and integration work