Senior Software Engineer - Model Performance
You'll make our inference stack as fast and efficient as possible, working from **CUDA kernels** to **serving frameworks** to eliminate bottlenecks. Your north star is inference performance: latency, throughput, and cost efficiency. You'll have autonomy, a large compute budget, and the technical support to push the limits on all three.
You'll **own the entire video content strategy** from ideation to final cut, crafting stories that capture the magic of **building the world's largest distributed GPU cluster**. You'll be embedded with our engineering team to extract authentic narratives from complex technical work.
Applied Machine Learning Engineer
You'll build and improve core ML systems for training specialized AI models at a well-funded startup. Your work will directly shape **custom model quality** at scale, measured by frontier performance and efficiency. You'll own the full training lifecycle from data intake to shipped results.
Hybrid|Junior|Full-time|AI-ML
Machine Learning Researcher
You'll push the boundaries of **LLM post-training** by exploring new architectures and training techniques. Your work will directly impact **real customer products** at a fast-paced startup. You'll have **large compute budgets** and autonomy within a collaborative team.
Fullstack Engineer - Frontend Focus
You'll own the user experience for a **globally distributed LLM inference platform**, building dashboards and customer-facing apps. Your work will directly impact how users observe, configure, and pay for inference at scale. This role offers **significant early-stage equity** and the chance to collaborate with **distributed-systems engineers** in a high-agency team.
Hybrid|Senior|Full-time|AI-ML