Member of Technical Staff - Data Quality Engineer (Pre-training)
San Francisco
✨ $200k-$350k / year (est.)
full-time · ai-ml
💼 About This Role
You'll own data quality for LLM pre-training at an AI research company building open superintelligence. You'll design automated quality checks and collaborate with researchers to turn data quality insights into measurable standards that impact model performance.
🎯 What You'll Do
- Own upstream data quality for LLM pre-training across languages and modalities
- Partner with research teams to translate requirements into measurable quality signals
- Design and validate automated QA methods for large-scale data campaigns
- Build reusable QA pipelines delivering high-quality data to pre-training teams
📋 Requirements
- Strong engineering fundamentals, with experience building data pipelines or QA systems
- Proficiency in Python and in building ML/LLM workflows
- Experience with large datasets and automated evaluation systems
- Ability to translate quality concerns into concrete signals and feedback
✨ Nice to Have
- Experience with LLM-as-a-Judge or model-assisted quality checks
- Familiarity with how LLMs are trained and evaluated
- Excellent communication across teams
🎁 Benefits & Perks
- 💰 Top-tier compensation with salary and equity
- 🏥 Comprehensive health & wellness coverage
- 👶 Fully paid parental leave and family planning support
- 🏖️ Paid time off and relocation support
- 🍽️ Daily lunch and dinner plus team off-sites