1d ago
Sr. Data Extraction Engineer
Columbia
✨ $140k-$180k / year est.
full-time · senior · Remote · ai-ml
🛠 Tech Stack
💼 About This Role
You'll drive end-to-end data extraction workflows across complex websites, ensuring accurate and reliable datasets. You'll use internal tools like Apify and OpenRouter alongside custom workflows to improve data collection. This role is ideal for experienced web scraping professionals who want to apply generative AI to data extraction.
🎯 What You'll Do
- Own end-to-end data extraction workflows across complex websites.
- Leverage internal tools (Apify, OpenRouter) alongside custom workflows.
- Ensure reliable extraction from dynamic web sources, adapting workflows as sites change.
- Enforce data quality standards through validation checks before delivery.
📋 Requirements
- At least 3 years of relevant experience in data engineering or web scraping.
- Strong experience in Python web scraping including dynamic content and APIs.
- Proven ability to extract data from complex structures and clean datasets.
- Hands-on experience with LLMs and AI frameworks for enhancing automation.
✨ Nice to Have
- Bachelor's or Master's degree in Engineering, Computer Science, or a related field.
- Portfolio or link to GitHub showcasing relevant projects.
- Experience with proxies and bypassing anti-scraping measures.
🎁 Benefits & Perks
- 🏠 Fully remote work on your schedule with just a laptop and stable internet.
- 🤖 Hands-on experience in a hybrid human-AI environment.
- 💰 Performance-based bonus programs for high-quality work.
📨 Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
- 1. Recruiter Call · 30 min
- 2. Technical Interview · 60 min
- 3. Take-Home Assignment · 2-3 hours