Software Engineer, Agent Dev Velocity

San Francisco, CA

$214k-$300k / year

Full-time · Hybrid · Software

💼 About This Role

You'll build the evaluation infrastructure that helps Notion ship high-quality AI faster and more safely. You'll work at the intersection of developer tooling, distributed systems, and measurement to create scalable eval runners and durable benchmarks that keep teams honest about quality over time.

🎯 What You'll Do

  • Build and improve scalable eval runners and harnesses.
  • Create tools for adding high-signal evals (templates, fixtures, debugging).
  • Maintain benchmark and dataset tooling (curation, versioning, regression tracking).
  • Improve reliability and observability for eval execution.

📋 Requirements

  • Strong software engineering fundamentals and production systems experience.
  • Proficiency with TypeScript/Node and/or Python.
  • Experience building reliable systems in distributed environments (queues, retries, idempotency).
  • Comfort working with data pipelines (batch processing, versioning, reproducibility).

✨ Nice to Have

  • Experience building developer tooling (CLI, CI integrations, internal platforms).
  • Familiarity with LLM evaluation techniques (rubrics, human review, dataset curation).
  • Experience collaborating across teams to drive adoption.

🎁 Benefits & Perks

  • 💰 Competitive cash compensation
  • 📈 Equity
  • 🏖️ Flexible PTO
  • 🩺 Health insurance

📨 Hiring Process

Estimated timeline: 3-5 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Interview · 60 min
  3. Onsite Interviews · 4 hours