2h ago
Member of Technical Staff - Safety Lead
San Francisco
✨ $200k-$300k / year (est.)
full-time · senior · ai-ml
💼 About This Role
You'll own the red-teaming and adversarial evaluation pipeline for Reflection's open-weight AI models, probing for failure modes across security, misuse, and alignment gaps. You'll work hand-in-hand with the Alignment team to translate safety findings into concrete guardrails for every release. This role is a critical gatekeeper for shipping open-weight models.
🎯 What You'll Do
- Own the red-teaming and adversarial evaluation pipeline for Reflection's open-weight models.
- Translate safety findings into concrete guardrails with the Alignment team.
- Validate that every release meets risk thresholds before shipping.
- Develop scalable, automated safety benchmarks and adversarial tests.
📋 Requirements
- Graduate degree (MS/PhD) in CS or ML, or equivalent experience in AI safety.
- Deep technical understanding of LLM safety and adversarial attacks.
- Strong software engineering skills for building automated evaluation pipelines.
- Experience with RLHF/RLAIF preferred.
✨ Nice to Have
- Experience with reinforcement learning (RLHF/RLAIF).
- Experience building large-scale ML systems.
🎁 Benefits & Perks
- 💰 Top-tier compensation with salary and equity.
- 🏥 Comprehensive health and wellness coverage, including medical, dental, and vision.
- 👶 Fully paid parental leave and family planning support.
- 🏖️ Paid time off and relocation support.
- 🍽️ Daily lunch and dinner provided, plus off-sites.