AI Benchmark Engineer
Turkey
✨ $100k-$180k / year (est.)
contract · senior · Remote · ai-ml
💼 About This Role
You'll build multilingual evaluation tasks for large language models, focusing on terminal-based software challenges. Your work will measure multilingual robustness across encoding and locale edge cases. This is a remote freelance role with flexible hours.
🎯 What You'll Do
- Design and build benchmark tasks for coding agents
- Create realistic task environments using native language data
- Develop robust reference implementations and verifier scripts
- Calibrate task difficulty across multiple model tiers
📋 Requirements
- 5+ years of software engineering experience
- Native Turkish fluency with deep grammar knowledge
- Strong proficiency in Python, shell scripting, and data processing
- Experience with terminal/CLI development workflows
✨ Nice to Have
- Background at leading tech companies or top-tier universities
- Knowledge of Unicode normalization and locale-dependent conventions
- Familiarity with coding agents
🎁 Benefits & Perks
- 🗓️ Flexible schedule as an independent contractor
- 💰 Competitive rates with prompt payments
- 🌍 Work on cutting-edge AI and language technology
- 🤝 Join a global community of language professionals
- 📚 Access to diverse, innovative projects
📨 Hiring Process
Submit your application with a CV, complete a GenAI assessment, and finalize onboarding.