Senior Software Engineer, Inference Platform

Palo Alto
Full-time, Senior, Hybrid, Database/cloud software

Description

You will design and build components of a multi-tenant inference platform integrated with MongoDB Atlas, supporting semantic search and hybrid retrieval. Collaborate with AI engineers and researchers to productionize inference for embedding models and rerankers, and improve performance and resource efficiency in a cloud-native environment.

Requirements

  • 5+ years building backend or infrastructure systems at scale
  • Strong software engineering skills in Go, Rust, Python, or C++, with an emphasis on performance and reliability
  • Experience with cloud-native architectures, distributed systems, and multi-tenant service design
  • Familiarity with ML model serving concepts and inference runtimes
  • Comfort working cross-functionally with ML researchers and platform teams

Responsibilities

  • Design and build components of a multi-tenant inference platform integrated with MongoDB Atlas
  • Collaborate with AI engineers and researchers to productionize inference for embedding models and rerankers
  • Contribute to platform capabilities such as latency-aware routing, model versioning, and observability
  • Improve performance, autoscaling, GPU utilization, and resource efficiency in a cloud-native environment
  • Work across product, infrastructure, and ML teams to meet scale, reliability, and latency demands