Research Engineer/Research Scientist – Model Transparency
London, UK
✨ $100k–$200k / year (est.)
Full-time · AI/ML
💼 About This Role
You'll join the Model Transparency team at the AI Security Institute, researching how oversight of frontier AI systems can remain reliable as models become less transparent. You'll develop methods to detect and measure risks like evaluation awareness and unfaithful reasoning, influencing both AI companies and government policy. This role offers direct impact on global AI safety.
🎯 What You'll Do
- Design and run experiments on open-weight models to study alignment phenomena
- Develop chain-of-thought monitorability benchmarks for frontier systems
- Build tooling and infrastructure for agent orchestration and RL pipelines
- Review frontier lab risk assessments and safety cases
📋 Requirements
- Get-things-done mindset with ownership and fast execution
- Self-sufficiency and teamwork: able to define your own agenda while contributing to shared goals
- Ability to build and orchestrate AI agents for effective task completion
- Demonstrated track record of relevant high-quality work (publications, blog posts, etc.)
✨ Nice to Have
- Experience with interpretability methods like sparse autoencoders or probes
- Background in capability or alignment evaluations
- Prior work collaborating with frontier AI labs
🎁 Benefits & Perks
- 🔬 Work with world-leading AI safety researchers from Anthropic, OpenAI, and DeepMind
- 🏛️ Direct lines to UK government and No. 10, influencing policy
- 🌍 International collaboration with allied governments and frontier labs
- 🚀 Unique resources and agility to shape AI development and government action
📨 Hiring Process
Estimated timeline: 2–4 weeks
1. Recruiter Call · 30 min
2. Technical Interview · 60 min
3. Team Fit Interview · 45 min