Research Scientist/Engineer (Evaluations)

London

$135k-$270k / year

Full-time · Artificial Intelligence · Visa Sponsor

💼 About This Role

You'll run pre-deployment evaluation campaigns on the world's most capable AI systems, partnering with labs like OpenAI, Anthropic, and Google DeepMind. Your core impact is surfacing behavioral patterns in frontier models and building new evaluations for frontier risks. This role offers unique access to unreleased models before anyone else.

🎯 What You'll Do

  • Run pre-deployment evaluation campaigns on the most capable AI systems.
  • Dive deep into AI cognition and surface behavioral patterns.
  • Build new evaluations for frontier risks from design to scale.
  • Work directly with frontier AI developers to inform deployment decisions.

📋 Requirements

  • Strong Python software engineering skills with production experience.
  • Ability to optimize workflows in fast-paced environments.
  • Data analysis skills to extract signal from large, messy datasets.
  • Clear writing and communication for technical and non-technical audiences.

✨ Nice to Have

  • Experience with the Inspect evals framework.
  • Experience using different AI models for various tasks.
  • Self-taught background or non-traditional experience.

🎁 Benefits & Perks

  • 💰 Market-competitive salary (100k-200k GBP) plus equity.
  • 🏖️ Unlimited vacation and sick leave.
  • 👶 Up to 6 months paid parental leave.
  • 🍽️ Lunch, dinner, and snacks provided on workdays.
  • 📚 $1,000 yearly professional development budget.