Member of Technical Staff, Synthetic Data
Toronto
✨ $200k-$350k / year (est.)
full-time, senior, Remote, ai-ml
💼 About This Role
You'll build inference pipelines on large GPU clusters for synthetic data generation at Cohere, a frontier AI company. Your work will directly improve model training metrics such as throughput and token efficiency. The role is remote-friendly and puts you at the cutting edge of AI research.
🎯 What You'll Do
- Design and build scalable inference pipelines on large GPU clusters.
- Conduct data ablations to assess data quality and model performance.
- Research and implement innovative synthetic data curation methods.
- Collaborate with cross-functional teams on data pipeline requirements.
📋 Requirements
- Strong software engineering skills, with proficiency in Python.
- Experience with data processing frameworks like Apache Spark.
- Experience working with LLMs through projects or experimentation.
- Familiarity with large-scale datasets (web, code, multilingual corpora).
✨ Nice to Have
- Experience with LLM inference frameworks like vLLM or TensorRT.
- Papers published at top-tier venues (NeurIPS, ICML, ICLR, etc.).
🎁 Benefits & Perks
- 🤝 Open and inclusive culture
- 🧑‍💻 Work on cutting-edge AI research
- 🍽 Weekly lunch stipend
- 🦷 Full health and dental benefits
- ✈️ 6 weeks of vacation