Senior Software Engineer, Observability

San Francisco, CA - US

$172k-$209k / year

full-time, senior, ai-ml

💼 About This Role

You'll design and build distributed systems that process massive volumes of real-time telemetry data for Crusoe's cloud infrastructure. Your work will help engineers understand system behavior, troubleshoot issues faster, and operate large-scale infrastructure with confidence. This role offers an opportunity to shape AI infrastructure built from the ground up.

🎯 What You'll Do

  • Maintain core observability tools for metrics, events, logs, and tracing.
  • Develop data pipelines to move telemetry data to backend storage.
  • Manage large-scale data ingestion and storage for high-volume environments.
  • Participate in on-call rotation to address production issues.

📋 Requirements

  • 5+ years of experience in software or systems engineering.
  • Proficiency in Java, Go, or Python for production-level code.
  • Practical experience managing Kubernetes clusters in production.
  • Experience deploying services with Helm and YAML-based configurations.

✨ Nice to Have

  • Experience with Prometheus, Grafana, Loki, ClickHouse, or Elasticsearch.
  • Familiarity with Kafka or similar message queuing systems.
  • Experience using Terraform for infrastructure provisioning.

🎁 Benefits & Perks

  • 💰 Competitive pay
  • 📈 Restricted Stock Units
  • 🏥 Health insurance (HDHP, PPO, vision, dental)
  • 👶 Paid Parental Leave
  • 🏖️ Generous PTO