ML/Research Engineer, Safeguards
San Francisco, CA; New York City, NY
$350,000-$500,000 / year
Full-time; Senior; Artificial Intelligence; Visa Sponsor
Description
You will build systems to detect and mitigate misuse of AI systems. This includes developing classifiers for harmful behavior, monitoring attacks that span multiple exchanges, improving the safety of agentic products, and conducting research on red-teaming and adversarial robustness.
Requirements
- 4+ years experience in ML engineering, research engineering, or applied research
- Proficiency in Python and experience building ML systems
- Comfortable working across the research-to-deployment pipeline
- Strong communication skills to explain complex technical concepts to non-technical stakeholders
- Bachelor's degree in a relevant field, or equivalent experience
Responsibilities
- Develop classifiers to detect misuse and anomalous behavior at scale, including synthetic data pipelines and representative evaluations
- Build systems to monitor harms spanning multiple exchanges (e.g., coordinated cyber attacks, influence operations)
- Evaluate and improve safety of agentic products, including threat models and mitigations for prompt injection
- Conduct research on automated red-teaming and adversarial robustness
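To make the first two responsibilities concrete, here is a minimal illustrative sketch of a misuse scorer that aggregates signals across a multi-exchange conversation. All names, weights, and the keyword heuristic itself are hypothetical; a production system would use learned classifiers trained on labeled and synthetic data, not keyword matching.

```python
from dataclasses import dataclass

# Hypothetical signal weights for illustration only; a real system would
# replace this table with a trained classifier's scoring function.
MISUSE_SIGNALS = {
    "exploit": 0.6,
    "phishing": 0.7,
    "malware": 0.8,
}

@dataclass
class Verdict:
    score: float
    flagged: bool

def score_exchange(text: str, threshold: float = 0.5) -> Verdict:
    """Score a single exchange for misuse signals (toy heuristic)."""
    lowered = text.lower()
    # Sum the weights of any matching signals, capped at 1.0.
    score = min(1.0, sum(w for k, w in MISUSE_SIGNALS.items() if k in lowered))
    return Verdict(score=score, flagged=score >= threshold)

def score_conversation(exchanges: list[str], threshold: float = 0.5) -> Verdict:
    """Aggregate per-exchange scores so that harms spanning multiple
    exchanges, each individually benign-looking, can still trip the flag."""
    total = min(1.0, sum(score_exchange(e).score for e in exchanges))
    return Verdict(score=total, flagged=total >= threshold)
```

The conversation-level aggregation is the point of interest: coordinated attacks are often invisible at the single-exchange level, so the monitor accumulates evidence over the whole interaction before deciding.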