Senior Data Engineer (Data Lake, Forecasting & Governance) @ Leewayhertz

Job Description


We are looking for an experienced Senior Data Engineer to lead the development of scalable AWS-native data lake pipelines, with a strong focus on time series forecasting, upsert-ready architectures, and enterprise-grade data governance. This role demands end-to-end ownership of the data lifecycle, from ingestion to partitioning, versioning, QA, lineage tracking, and BI delivery.

The ideal candidate will be highly proficient in AWS data services, PySpark, and versioned storage formats such as Apache Hudi or Iceberg. A strong understanding of data quality, observability, governance, and metadata management in large-scale analytical systems is critical.


Roles & Responsibilities

  • Design and implement data lake zoning (Raw, Clean, Modeled) using Amazon S3, AWS Glue, and Athena.
  • Ingest structured and unstructured datasets including POS, USDA, Circana, and internal sales data.
  • Build versioned and upsert-ready ETL pipelines using Apache Hudi or Iceberg.
  • Create forecast-ready datasets with lagged, rolling, and trend features for revenue and occupancy modeling.
  • Optimize Athena datasets with partitioning, CTAS queries, and S3 metadata tagging.
  • Implement S3 lifecycle policies, intelligent file partitioning, and audit logging for performance and compliance.
  • Build reusable transformation logic using dbt-core or PySpark to support KPIs and time series outputs.
  • Integrate data quality frameworks such as Great Expectations, custom logs, and AWS CloudWatch for field-level validation and anomaly detection.
  • Apply data governance practices using tools like OpenMetadata or Atlan, enabling lineage tracking, data cataloging, and impact analysis.
  • Establish QA automation frameworks for pipeline validation, data regression testing, and UAT handoff.
  • Collaborate with BI, QA, and business teams to finalize schema design and deliverables for dashboard consumption.
  • Ensure compliance with enterprise data governance policies and enable discovery and collaboration through metadata platforms.

Preferred Candidate Profile

  • 9-12 years of experience in data engineering.
  • Deep hands-on experience with AWS Glue, Athena, S3, Step Functions, and the Glue Data Catalog.
  • Strong command of PySpark, dbt-core, CTAS query optimization, and advanced partition strategies.
  • Proven experience with versioned ingestion using Apache Hudi, Iceberg, or Delta Lake.
  • Experience in data lineage, metadata tagging, and governance tooling using OpenMetadata, Atlan, or similar platforms.
  • Proficiency in feature engineering for time series forecasting (lags, rolling windows, trends).
  • Expertise in Git-based workflows, CI/CD, and deployment automation (Bitbucket or similar).
  • Strong understanding of time series KPIs: revenue forecasts, occupancy trends, demand volatility, etc.
  • Knowledge of statistical forecasting frameworks (e.g., Prophet, GluonTS, Scikit-learn).
  • Experience with Superset or Streamlit for QA visualization and UAT.
  • Experience building data QA frameworks and embedding data validation checks at each stage of the ETL lifecycle.
  • Independent thinker capable of designing systems that scale with evolving business logic and compliance requirements.
  • Excellent communication skills for collaboration with BI, QA, data governance, and business stakeholders.
  • High attention to detail, especially around data accuracy, documentation, traceability, and auditability.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time

Contact Details:

Company: Leewayhertz
Location(s): Noida, Gurugram


Keyskills: Governance, Forecasting, Data Lake, OpenMetadata, Data Engineering, Lineage, Great Expectations, Atlan, Step Functions, QA Frameworks, Glue, Iceberg, AWS Athena

₹ Not Disclosed

Leewayhertz

LeewayHertz is a leading software development company delivering tailor-made digital solutions to businesses worldwide. Our team of 250+ full-stack developers, designers and innovators has designed and developed 100+ digital solutions across industry verticals. As a close-knit team of AI and web3 ex...