Senior Data Engineer

Kathmandu, Nepal
Full-time · Senior · Healthcare

Description

As a Senior Data Engineer at Abacus Insights, you will architect and implement high-volume batch and real-time data pipelines in a modern cloud environment, enabling clean, connected healthcare data for GenAI use cases. You will work with Databricks, Snowflake, and AWS to build scalable ingestion frameworks, optimize data models, and enforce engineering best practices while mentoring junior team members.

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, or a closely related technical field, with 5+ years of hands-on experience as a Data Engineer building and operating large-scale, distributed data systems in modern cloud environments.
  • Proven ability to clearly communicate complex technical concepts and solutions to both technical and non-technical stakeholders.
  • Expert-level proficiency in Python, SQL, and PySpark, including development of distributed transformations and performance-optimized queries.
  • Demonstrated experience designing, building, and operating ETL/ELT pipelines using Databricks, Airflow, or similar orchestration and workflow automation tools.
  • Proven experience architecting or operating large-scale data platforms using dbt, Kafka, Delta Lake, and event-driven or streaming architectures in cloud-native data or platform engineering environments.
  • Strong working knowledge of AWS data services (S3, SQS, Lambda, Glue, IAM or equivalents), structured and semi-structured data formats (Parquet, ORC, JSON, Avro), schema evolution, and optimization techniques.
  • Hands-on experience with Terraform and CI/CD pipelines (e.g., GitLab CI); deep expertise in SQL and compute optimization (partitioning, clustering, Z-Ordering, pruning, caching); and performance tuning on cloud data warehouses such as Snowflake (preferred), BigQuery, or Redshift.

Responsibilities

  • Architect, design, and implement high-volume batch and real-time data pipelines using PySpark, SparkSQL, Databricks Workflows, and distributed processing frameworks.
  • Build end-to-end ingestion frameworks integrating Databricks, Snowflake, AWS services (S3, SQS, Lambda), and vendor APIs, ensuring data quality, lineage, and schema evolution.
  • Design and optimize data models (star/snowflake schemas) and apply performance tuning techniques for analytical workloads on cloud data warehouses.
  • Translate complex business requirements into detailed technical specifications, reusable engineering components, and implementation artifacts.
  • Establish and enforce data engineering best practices, including CI/CD for data pipelines, version control, automated testing, orchestration, logging, and observability.
  • Drive performance and cost optimization through profiling, cluster tuning, partitioning, indexing, caching, and compute optimization across Databricks and Snowflake.
  • Ensure operational excellence and team growth by producing high-quality documentation, monitoring and troubleshooting production pipelines, performing root-cause analysis, and mentoring junior engineers.