Senior Software Engineer — AI Evaluation & Benchmarks at Jobgether — CareerPair

16h ago

$166.4k-$208k / year

full-time · senior · Remote · ai-ml

💼 About This Role

You'll design and build coding benchmarks that evaluate frontier AI models on real-world software engineering tasks. Your work directly influences how next-generation models are trained and improved. This role sits at the intersection of software engineering and AI research, where you'll develop scalable systems to run evaluations across large codebases.

🎯 What You'll Do

Design and build coding benchmarks for frontier AI models
Develop scalable evaluation pipelines and data infrastructure
Analyze AI-generated code for correctness and performance issues
Contribute to the design and evolution of evaluation methodologies

📋 Requirements

4+ years of professional software engineering experience
Expert-level Python development skills
Experience with large, complex, production-grade codebases
Experience building or contributing to LLM evaluation systems

✨ Nice to Have

Familiarity with JavaScript, Go, or C++
Background in ML evaluation methodologies
Open-source contributions or security engineering experience

🎁 Benefits & Perks

💰 Competitive hourly compensation ($80-$100/hr)
🌍 Fully remote with global flexibility
📆 Weekly payments via PayPal or Stripe
⏳ Short-term 3-month contract with potential extension
🚀 Work on cutting-edge AI systems

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

1. Recruiter screen · 30 min
2. Technical interview · 60 min
3. Hiring decision · 1-2 weeks
