1d ago
Software Engineer, Agent Dev Velocity
San Francisco, CA
$214k-$300k / year
Full-time · Hybrid · Software
About This Role
You'll build the evaluation infrastructure that helps Notion ship high-quality AI faster and more safely. You'll work at the intersection of developer tooling, distributed systems, and measurement to create scalable eval runners and durable benchmarks that keep teams honest about quality over time.
What You'll Do
- Build and improve scalable eval runners and harnesses.
- Create tools for adding high-signal evals (templates, fixtures, debugging).
- Maintain benchmark and dataset tooling (curation, versioning, regression tracking).
- Improve reliability and observability for eval execution.
Requirements
- Strong software engineering fundamentals and production systems experience.
- Proficiency with TypeScript/Node and/or Python.
- Experience building reliable systems in distributed environments (queues, retries, idempotency).
- Comfort working with data pipelines (batch processing, versioning, reproducibility).
Nice to Have
- Experience building developer tooling (CLI, CI integrations, internal platforms).
- Familiarity with LLM evaluation techniques (rubrics, human review, dataset curation).
- Experience collaborating across teams to drive adoption.
Benefits & Perks
- Competitive cash compensation
- Equity
- Flexible PTO
- Health insurance
Hiring Process
Estimated timeline: 3-5 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Interview · 60 min
- 3. Onsite Interviews · 4 hours