Site Reliability Engineer

New York, NY

$175,000-$225,000 / year

full-time, senior, AI

Description

You will architect and operate scalable production systems supporting multi-tenant cloud and on-premise deployments, design a real-time distributed execution engine for AI applications, and optimize AI agent architectures. You will also partner with product and customers to define the roadmap and build new AI experiences.

Requirements

  • 3+ years managing cloud-based production applications, with deep knowledge of containers, VMs, caches, task queues, networking, and operating systems
  • Experience designing and deploying production infrastructure at scale with Docker, Kubernetes, ECS/EKS, or Firecracker
  • Strong product sense and strategic thinking, with a focus on great user experiences

Responsibilities

  • Architect and operate scalable production systems supporting multi-tenant cloud and on-premise deployments
  • Design and develop a real-time distributed execution engine for AI applications, workflows, and agents
  • Build, deploy, and optimize AI agent architecture, guardrails, and evals
  • Partner with product and customers to define the roadmap and bring new builder and AI experiences to life