Senior ML Data Engineer
Cincinnati, OH; Chicago, IL
$97,000-$166,750 / year
Full-time, senior, remote; retail data science, insights, and media
Description
As a Senior ML Data Engineer on the Relevancy Sciences team, you will architect and build the data infrastructure powering machine learning models for Kroger's e-commerce platform. You will own feature stores, training pipelines, and ML data operations, enabling data scientists to iterate rapidly with production-grade reliability.
Requirements
- 3+ years of hands-on experience building and maintaining ML data pipelines in production environments.
- Expert-level SQL and advanced Python skills, including data processing frameworks and ML libraries.
- Proven experience with the GCP ecosystem (BigQuery, Dataflow, Vertex AI Feature Store).
- Deep understanding of end-to-end ML workflows including training data preparation, model evaluation, and serving.
- Production operations mindset, with experience in monitoring, alerting, on-call rotations, and SLA commitments.
- Strongly preferred: hands-on experience with Feature Store platforms (Vertex AI, Feast, Tecton).
- Strongly preferred: knowledge of point-in-time correctness, temporal joins, and time-series data modeling.
- Strongly preferred: multi-cloud experience with Azure and GCP.
- Strongly preferred: familiarity with core ML concepts and a background in analytics engineering and ML data engineering.
Responsibilities
- Own the feature request lifecycle from intake through deployment, designing scalable feature pipelines that read from BigQuery and Azure Data Lake and write to Vertex AI Feature Store and BigQuery.
- Build streaming feature engineering pipelines using Apache Beam/Dataflow for real-time computation and low-latency model serving.
- Ensure point-in-time correctness and online/offline feature consistency to prevent data leakage.
- Implement drift detection, data quality monitoring, and alerting mechanisms.
- Develop self-service tools and templates that let data scientists create features independently.
- Build automated training/evaluation data pipelines with point-in-time correctness and sampling strategies.
- Maintain comprehensive dataset versioning for full traceability.
- Serve as Tier 2/3 on-call responder for feature data quality incidents.
- Maintain lineage tracking and metadata management for data traceability.
- Support regulatory compliance through data governance and documentation.
- Establish and enforce feature naming conventions, data quality thresholds, and point-in-time correctness patterns.
- Conduct workshops on feature engineering best practices.
- Partner with Data Scientists, ML Engineers, Data Engineering, and MLOps teams.