about 3 hours ago

Senior ML Data Engineer

Cincinnati, OH; Chicago, IL

$97,000-$166,750 / year

Full-time | Senior | Remote | Retail data science, insights, and media

Description

As a Senior ML Data Engineer on the Relevancy Sciences team, you will architect and build the data infrastructure powering machine learning models for Kroger's e-commerce platform. You will own feature stores, training pipelines, and ML data operations, enabling data scientists to iterate rapidly with production-grade reliability.

Requirements

  • 3+ years of hands-on experience building and maintaining ML data pipelines in production environments.
  • Expert-level SQL skills and advanced Python programming with data processing frameworks and ML libraries.
  • Proven experience with the GCP ecosystem (BigQuery, Dataflow, Vertex AI Feature Store).
  • Deep understanding of end-to-end ML workflows including training data preparation, model evaluation, and serving.
  • A production operations mindset, including monitoring, alerting, on-call duty, and commitment to SLAs.
  • Strongly preferred: hands-on experience with Feature Store platforms (Vertex AI, Feast, Tecton).
  • Strongly preferred: knowledge of point-in-time correctness, temporal joins, and time-series data modeling.
  • Strongly preferred: multi-cloud experience with Azure and GCP.
  • Strongly preferred: familiarity with core ML concepts and a background in analytics engineering and ML data engineering.

Responsibilities

  • Own the feature request lifecycle from intake through deployment, designing scalable feature pipelines that read from BigQuery and Azure Data Lake and write to Vertex AI Feature Store and BigQuery.
  • Build streaming feature engineering pipelines using Apache Beam/Dataflow for real-time computation and low-latency model serving.
  • Ensure point-in-time correctness and online/offline feature consistency to prevent data leakage.
  • Implement drift detection, data quality monitoring, and alerting mechanisms.
  • Develop self-service tools and templates so that data scientists can create features independently.
  • Build automated training/evaluation data pipelines with point-in-time correctness and sampling strategies.
  • Maintain comprehensive dataset versioning for full traceability.
  • Serve as Tier 2/3 on-call responder for feature data quality incidents.
  • Maintain lineage tracking and metadata management for data traceability.
  • Support regulatory compliance through data governance and documentation.
  • Establish and enforce feature naming conventions, data quality thresholds, and point-in-time correctness patterns.
  • Conduct workshops on feature engineering best practices.
  • Partner with Data Scientists, ML Engineers, Data Engineering, and MLOps teams.