19h ago

Member of Technical Staff, Synthetic Data

Toronto

$200k-$350k / year (est.)

full-time · senior · Remote · ai-ml

💼 About This Role

You'll build inference pipelines on large GPU clusters for synthetic data generation at Cohere, a frontier AI company. Your work will directly improve model training metrics such as throughput and token efficiency. The role is remote-friendly and sits at the cutting edge of AI research.

🎯 What You'll Do

  • Design and build scalable inference pipelines on large GPU clusters.
  • Conduct data ablations to assess data quality and model performance.
  • Research and implement innovative synthetic data curation methods.
  • Collaborate with cross-functional teams on data pipeline requirements.

📋 Requirements

  • Strong software engineering skills, with proficiency in Python.
  • Experience with data processing frameworks like Apache Spark.
  • Experience working with LLMs through projects or experimentation.
  • Familiarity with large-scale datasets (web, code, multilingual corpora).

✨ Nice to Have

  • Experience with LLM inference frameworks like vLLM or TensorRT.
  • Papers published at top-tier venues (NeurIPS, ICML, ICLR, etc.).

🎁 Benefits & Perks

  • 🤝 Open and inclusive culture
  • 🧑‍💻 Work on cutting-edge AI research
  • 🍽 Weekly lunch stipend
  • 🦷 Full health and dental benefits
  • ✈️ 6 weeks of vacation