4h ago
Member of Technical Staff, AI Engineering
San Francisco
$160k-$270k / year
full-time
healthcare
💼 About This Role
You'll build the bridge from "impressive demo" to lights-out production for AI agents that automate healthcare back-office work. You'll own the serving, monitoring, and continuous improvement of LLMs and VLMs in the field, turning Mandolin into a true autopilot where clinicians handle only the exceptions. In this role you'll partner with top healthcare institutions to make groundbreaking treatments accessible faster.
🎯 What You'll Do
- Deploy and operate LLMs and VLMs for real-time inference using vLLM or SGLang
- Build and maintain HIPAA-compliant ML pipelines end to end
- Design evaluation harnesses and telemetry systems to surface model degradation
- Diagnose and resolve ML workflow bottlenecks across GPU, CPU, and serverless environments
📋 Requirements
- Production experience deploying and serving LLMs or VLMs with inference runtimes like vLLM or SGLang
- White-box understanding of transformer-based models: tokenization, autoregressive generation, and sampling
- Hands-on experience with document parsing or OCR models
- Ability to debug ML system bottlenecks and reason about IO, memory, and compute tradeoffs
✨ Nice to Have
- Healthcare, claims processing, or complex form-extraction experience
- Familiarity with fine-tuning techniques (LoRA/PEFT) or RAG
- Experience with cloud ML stacks (Vertex AI, SageMaker) or Kubernetes-native ML workflows