16h ago

Senior Research Scientist, Model Evaluation

Toronto

$200k-$300k / year (est.)

Full-time · Senior · Hybrid · AI/ML

💼 About This Role

You'll create next-generation evaluation methods and infrastructure to measure LLM progress at a company scaling intelligence. Your work will directly shape model capabilities and set the agenda for future AI. You'll collaborate with cross-functional teams to translate model feedback into trustworthy evaluations.

🎯 What You'll Do

  • Create ambitious new evaluation benchmarks for LLMs.
  • Translate model feedback into trustworthy, repeatable evaluations.
  • Conduct research on LLM evaluation methods and efficiency.
  • Build scalable tools for digging into model performance.

📋 Requirements

  • Ability to rapidly build prototypes that demonstrate LLM capabilities.
  • Experience reviewing complex data and LLM outputs for quality.
  • A drive to measure AI capabilities rigorously.
  • Strong software engineering skills.

✨ Nice to Have

  • Experience training LLM judges.
  • Experience refining LLM-based data synthesis pipelines.
  • Experience improving evaluation efficiency.

🎁 Benefits & Perks

  • 🤝 Inclusive culture and work environment
  • 🧑‍💻 Work on cutting-edge AI research
  • 🍽 Weekly lunch stipend and in-office meals
  • 🦷 Full health and dental benefits with mental health budget
  • ✈️ 6 weeks of vacation