Research Engineer / Scientist, Alignment Science
London, UK
Full-time · Senior · Hybrid · Artificial Intelligence
Description
You will design and run machine learning experiments to understand and steer the behavior of powerful AI systems, contributing to AI safety research in areas like AI control and alignment stress-testing. Your work will involve building tooling, evaluating jailbreaks, and collaborating with teams to mitigate risks from advanced AI.
Requirements
- Significant software, ML, or research engineering experience
- Experience contributing to empirical AI research projects
- Familiarity with technical AI safety research
- Preference for fast-moving, collaborative projects
Responsibilities
- Run multi-agent reinforcement learning experiments for AI safety
- Build tooling to evaluate effectiveness of LLM-generated jailbreaks
- Contribute to research papers, blog posts, and talks
- Test robustness of safety techniques by training models to subvert them
- Collaborate on safety-relevant projects with teams like Interpretability and Red Team