Senior Software Engineer, Observability

United States

$130,000-$170,000 / year

Full-time · Senior · Cloud Computing / AI Infrastructure

Description

You will design, build, and own the backend systems that power metrics and monitoring for large-scale infrastructure, and you will develop a comprehensive infrastructure maintenance platform. You will also evolve metrics pipelines, investigate production incidents, and collaborate with hardware and networking teams to improve reliability.

Requirements

  • 5+ years of professional software engineering experience
  • Strong production experience with Python and Go
  • Solid Linux fundamentals and comfort debugging live systems
  • Ability to write reliable, maintainable code and solve complex problems
  • Experience building and operating production systems at scale

Responsibilities

  • Design and build services for deep visibility into server fleets and data center systems
  • Evolve metrics, aggregation, and alerting pipelines with a focus on signal quality
  • Design and operate maintenance and remediation systems for fleet-wide changes
  • Investigate production incidents using on-host Linux debugging and drive root-cause fixes
  • Collaborate with hardware, networking, and data center operations teams