14h ago

Staff Software Engineer - AI Research Infrastructure

New York City, New York | San Francisco, California

$199k-$270k / year

full-time | senior | software

💼 About This Role

You'll design and build infrastructure to power large-scale AI experiments across thousands of GPUs at Databricks AI Research. You'll partner with research scientists to turn experimental workloads into robust pipelines and push the limits of what our infrastructure can support.

🎯 What You'll Do

  • Design and implement infrastructure for large-scale experiments and model training
  • Build job submission, scheduling, and monitoring abstractions
  • Create tooling for experiment management and CI/testing for research code
  • Influence the long-term roadmap for research computation

📋 Requirements

  • BS, MS, or PhD in Computer Science or a related field
  • 5+ years of software engineering experience including large-scale distributed systems
  • Deep experience with distributed systems and infrastructure (GPUs, clusters, cloud)
  • Proficient in systems languages (C++, Rust, Go, Java, Scala)

✨ Nice to Have

  • Experience with cluster schedulers or job orchestration (Kubernetes, Slurm, Ray)
  • Understanding of modern ML training and inference workflows
  • Experience driving complex systems from prototype to stable service

🎁 Benefits & Perks

  • 📈 Annual Performance Bonus
  • 📊 Equity
  • 🏖️ Comprehensive Benefits per region

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Phone Screen · 60 min
  3. Onsite Interviews · 4 hours