Research Engineer/Research Scientist – Model Transparency
London, UK
✨ $100k–$200k / year (est.)
Full-time · AI/ML
💼 About This Role
You'll join the Model Transparency team at the AI Security Institute, researching how oversight of frontier AI systems can remain reliable as models become less transparent. You'll develop methods to detect and measure risks like evaluation awareness and unfaithful reasoning, influencing both AI companies and government policy. This role offers direct impact on global AI safety.
🎯 What You'll Do
- Design and run experiments on open-weight models to study alignment phenomena
- Develop chain-of-thought monitorability benchmarks for frontier systems
- Build tooling and infrastructure for agent orchestration and RL pipelines
- Review frontier lab risk assessments and safety cases
📋 Requirements
- Get-things-done mindset with ownership and fast execution
- Self-sufficiency and teamwork: able to define your own agenda while contributing to shared goals
- Ability to build and orchestrate AI agents for effective task completion
- Demonstrated track record of relevant high-quality work (publications, blog posts, etc.)
✨ Nice to Have
- Experience with interpretability methods like sparse autoencoders or probes
- Background in capability or alignment evaluations
- Prior work collaborating with frontier AI labs
🎁 Benefits & Perks
- 🔬 Work with world-leading AI safety researchers from Anthropic, OpenAI, and DeepMind
- 🏛️ Direct lines to UK government and No. 10, influencing policy
- 🌍 International collaboration with allied governments and frontier labs
- 🚀 Unique resources and agility to shape AI development and government action
📨 Hiring Process
Estimated timeline: 2–4 weeks
1. Recruiter Call · 30 min
2. Technical Interview · 60 min
3. Team Fit Interview · 45 min