Member of Technical Staff, AI Engineering

San Francisco

$160k-$270k / year

full-time, healthcare

💼 About This Role

You'll build the bridge from "impressive demo" to lights-out production for AI agents that automate healthcare back-office work. You'll own the serving, monitoring, and continuous improvement of LLMs and VLMs in the field, turning Mandolin into a true autopilot where clinicians handle only the exceptions. In this role, you'll partner with top healthcare institutions to make groundbreaking treatments accessible faster.

🎯 What You'll Do

  • Deploy and operate LLMs and VLMs for real-time inference using vLLM or SGLang
  • Build and maintain HIPAA-compliant ML pipelines end to end
  • Design evaluation harnesses and telemetry systems to surface model degradation
  • Diagnose and resolve ML workflow bottlenecks across GPU, CPU, and serverless environments

📋 Requirements

  • Production experience deploying and serving LLMs or VLMs with inference runtimes like vLLM or SGLang
  • White-box understanding of transformer-based models: tokenization, autoregressive generation, and sampling
  • Hands-on experience with document parsing or OCR models
  • Ability to debug ML system bottlenecks and reason about IO, memory, and compute tradeoffs

✨ Nice to Have

  • Healthcare, claims processing, or complex form-extraction experience
  • Familiarity with fine-tuning techniques (LoRA/PEFT) or RAG
  • Experience with cloud ML stacks (Vertex AI, SageMaker) or Kubernetes-native ML workflows