Key Responsibilities:
Design and implement automated test strategies for validating ETL pipelines and data integration workflows.
Develop automated test scripts using industry-standard tools/frameworks to validate data ingestion, transformation, and delivery.
Test data flows involving Kafka topics, ensuring data consistency and schema validation across producers/consumers.
Validate data correctness between source systems (DB2) and downstream targets (e.g., S3, Redshift).
Perform regression, performance, and functional testing of data-centric applications and APIs.
Build and maintain reusable automation test cases and frameworks aligned with DevOps and CI/CD practices.
Collaborate closely with ETL developers, data engineers, and business analysts to define test coverage and quality metrics.
Analyze test results, debug issues, and proactively identify data anomalies.
Manage test data and leverage mocks/stubs for downstream systems where applicable.
Required Skills & Experience:
6+ years of experience in QA Engineering, with a strong focus on ETL/data warehouse testing.
Hands-on experience with Apache Kafka (topic validation, offset tracking, schema evolution).
Proficiency in validating data pipelines on AWS (S3, Lambda, Glue, Redshift, or EMR).
Strong database testing skills with DB2 and SQL-based validation across large datasets.
Solid experience with automation tools/frameworks like Pytest, JUnit, TestNG, Selenium, or custom scripting in Python.
Experience with CI/CD tools (e.g., Jenkins, GitHub Actions) and version control (Git).
Understanding of data quality, lineage, and metadata validation principles.
Strong communication skills and the ability to work across agile squads.
Nice to Have:
Knowledge of data lake architecture and cloud-native ETL orchestration.
Experience using Docker or Kubernetes for test environments.
Experience testing REST APIs and microservices supporting data flows.

Keyskills: sql, automation tools, data warehouse testing, etl, database testing, continuous integration, kubernetes, glue, amazon redshift, ci/cd, dbms, emr, docker, microservices, git, selenium, jenkins, rest, python, pytest, lambda expressions, kafka, qa engineering, aws, testng