1d ago

Software Engineer - Fleet

San Francisco, CA | Bellevue, WA

$203k-$300k / year

full-time | mid | Hybrid | ai-ml

💼 About This Role

You'll develop and maintain production systems for GPU fleet lifecycle management at Lambda, the leader in AI cloud infrastructure. Your work will automate provisioning and enable new hardware introduction, scaling infrastructure that powers superintelligence compute for tens of thousands of customers.

🎯 What You'll Do

  • Design and implement software for GPU fleet lifecycle management
  • Build automation frameworks for machine provisioning and configuration
  • Enable bring-up and validation for new server and accelerator platforms
  • Debug hardware and firmware issues including BIOS and BMC

📋 Requirements

  • 2+ years with Go or Python in production environments
  • 2+ years with configuration management tools and practices
  • Comfortable working in Linux environments and debugging OS/hardware layers
  • Ability to independently troubleshoot complex systems

✨ Nice to Have

  • Experience with Go in infrastructure or backend development
  • Hands-on experience with bare metal provisioning (Redfish, BMC, IPMI)
  • Experience diagnosing driver, firmware, and hardware compatibility on GPU servers

🎁 Benefits & Perks

  • 🏖️ Flexible PTO that we all actually use
  • 💰 Generous cash & equity compensation
  • 🏥 Health, dental, and vision coverage for you and dependents
  • 🏋️ Wellness and commuter stipends for select roles
  • 📈 401(k) plan with 2% company match (US employees)

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Call · 30 min
  2. Technical Screen · 60 min
  3. Onsite Interviews · full day