22h ago
Senior Software Engineer, Data
Seattle, WA
$126k-$189k / year
full-time · senior · Hybrid · ai-ml
About This Role
You'll build the data infrastructure behind AI research agents that explore scholarly literature, owning pipelines, designing schemas, and shipping production services for the Semantic Scholar corpus while applying ML techniques like entity resolution and text classification to improve data quality at scale.
What You'll Do
- Improve coverage and quality of the Semantic Scholar corpus
- Build and maintain scalable data pipelines for corpus integration
- Develop and deploy ML models for entity disambiguation and author linking
- Design and extend APIs that expose structured scholarly data
- Contribute to dashboards and tools for evaluating data quality
Requirements
- 8+ years of technical experience
- Strong Python engineering skills for data pipelines
- Experience with SQL and schema design in production settings
- Familiarity with ML workflows for large-scale structured data
Nice to Have
- Experience with author disambiguation or entity resolution
- Experience applying vector-based similarity or topic modeling
- Exposure to citation networks or scholarly data systems
Benefits & Perks
- Competitive base salary with bonus plans
- Generous paid vacation and sick leave
- Family leave
- Learning organization with weekly Academy lectures
- Collaborative and transparent culture
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Phone Screen · 60 min
- 3. Onsite Interviews · 4 hours