5h ago

AI Benchmark Engineer

Turkey

$100k-$180k / year (est.)

contract · senior · Remote · ai-ml

💼 About This Role

You'll build multilingual evaluation tasks for large language models, focusing on terminal-based software challenges. Your work will measure multilingual robustness across encoding and locale edge cases. This is a remote freelance role with flexible hours.

🎯 What You'll Do

  • Design and build benchmark tasks for coding agents
  • Create realistic task environments using native-language data
  • Develop robust reference implementations and verifier scripts
  • Calibrate task difficulty across multiple model tiers

📋 Requirements

  • 5+ years of software engineering experience
  • Native Turkish fluency with deep grammar knowledge
  • Strong proficiency in Python, shell scripting, and data processing
  • Experience with terminal/CLI development workflows

✨ Nice to Have

  • Background at leading tech companies or top-tier universities
  • Knowledge of Unicode normalization and locale-dependent conventions
  • Familiarity with coding agents

🎁 Benefits & Perks

  • 🗓️ Flexible schedule as an independent contractor
  • 💰 Competitive rates with prompt payments
  • 🌍 Work on cutting-edge AI and language technology
  • 🤝 Join a global community of language professionals
  • 📚 Access to diverse, innovative projects

📨 Hiring Process

Submit your application with a CV, complete a GenAI assessment, and finalize onboarding.
