Research Engineer, Infrastructure, Tinker
San Francisco
$350k-$475k / year
Full-time · Senior · Artificial Intelligence · Visa Sponsor
💼 About This Role
You'll design and scale the infrastructure behind Tinker to enable seamless model fine-tuning for internal teams and external customers. Your work will optimize GPU utilization, multi-tenant scheduling, and developer-friendly APIs, ensuring users can focus on research without infrastructure concerns. This role sits at the intersection of large-scale training systems and product infrastructure, with opportunities to publish and open-source your learnings.
🎯 What You'll Do
- Design and implement distributed job orchestration for multi-tenant workloads.
- Optimize GPU utilization, throughput, and reliability across clusters.
- Develop reusable frameworks to improve transparency, reproducibility, and performance.
- Work with researchers and engineers to turn fine-tuning challenges into product features.
📋 Requirements
- Bachelor's degree or equivalent in computer science or a related field.
- Understanding of deep learning frameworks like PyTorch or JAX.
- Ability to thrive in a highly collaborative environment with cross-functional partners.
- Strong engineering skills, including the ability to debug complex codebases.
✨ Nice to Have
- Hands-on experience with container orchestration for GPU workloads.
- Background in multi-tenant platform design and storage systems for ML artifacts.
- Contributions to open-source ML systems projects such as PyTorch, DeepSpeed, or XLA.
🎁 Benefits & Perks
- 🏥 Health, dental, and vision insurance
- 🏖️ Unlimited PTO
- 👶 Paid parental leave
- 🚚 Relocation support
- 💰 Generous compensation
📨 Hiring Process
This is an evergreen role: applications are reviewed continuously, and you may reapply after 6 months.
🚩 Heads Up
- This is an evergreen role with no immediate opening, so responses may be delayed.
- High experience level expected given senior salary range.
- Vague requirement: 'thrive in a highly collaborative environment' lacks measurable criteria.