Software Engineer, Safeguards at Jobs at Anthropic

Software Engineer, Safeguards

San Francisco, CA | New York City, NY

$320,000-$425,000 / year

Full-time | Senior | Artificial Intelligence | Visa Sponsor

Description

You will build safety and oversight mechanisms for AI systems, focusing on detecting unwanted model behaviors and preventing misuse. You'll develop monitoring systems, abuse detection infrastructure, and multi-layered defenses to ensure user well-being and enforce acceptable use policies.

Requirements

Bachelor's degree in Computer Science, Software Engineering or comparable experience
5-10+ years of software engineering experience, preferably in integrity, spam, fraud, or abuse detection
Proficiency in Python and TypeScript
Ability to work across the stack
Strong communication skills to explain complex technical concepts to non-technical stakeholders

Responsibilities

Develop monitoring systems to detect unwanted behaviors from API partners and trigger automated enforcement or manual review
Build abuse detection mechanisms and infrastructure
Surface abuse patterns to research teams to harden models at training stage
Build robust, multi-layered defenses for real-time improvement of safety mechanisms at scale
