Key Responsibilities
1. Data Quality Development & Monitoring
- Design and implement automated data quality rules and validation checks using Databricks (Delta Lake) and PySpark (see the sketch after this list).
- Build and operationalize data quality workflows in Ataccama ONE / Ataccama Studio.
- Perform data profiling, anomaly detection, and reconciliation across systems and data sources.
- Establish thresholds, KPIs, and alerts for data quality metrics.
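To illustrate the first responsibility, here is a minimal PySpark sketch of a threshold-based validation on a Delta table. The table name (sales.orders), the column names, and the 1% null threshold are hypothetical; in practice, rules like these would be externalized as configuration or authored in Ataccama.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

# Hypothetical Delta table; name and columns are illustrative.
df = spark.table("sales.orders")

total = df.count()
null_customer_ids = df.filter(F.col("customer_id").isNull()).count()
duplicate_keys = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

# Threshold-based pass/fail signals ("thresholds, KPIs, and alerts" above).
null_rate = null_customer_ids / total if total else 0.0
results = {
    "null_rate_customer_id": (null_rate, null_rate <= 0.01),
    "duplicate_order_ids": (duplicate_keys, duplicate_keys == 0),
}

for rule, (value, passed) in results.items():
    print(f"{rule}: value={value}, passed={passed}")
```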
2. Root Cause Analysis & Issue Management
- Investigate data anomalies and quality incidents using SQL, Python, and Ataccama diagnostics.
- Collaborate with data engineers and business analysts to identify and remediate root causes; a reconciliation sketch follows this list.
- Document recurring data issues and contribute to preventive automation solutions.
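A reconciliation of this kind often starts by comparing keys between a source and a target. The sketch below uses a left anti-join in PySpark to surface keys present in staging but missing from the target; the table and key names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dq-reconciliation").getOrCreate()

# Hypothetical source/target pair; names are illustrative.
source = spark.table("staging.orders")
target = spark.table("sales.orders")

# Keys present in the source but absent from the target point to load gaps.
missing_in_target = source.join(target, on="order_id", how="left_anti")

print(f"source rows:  {source.count()}")
print(f"target rows:  {target.count()}")
print(f"missing keys: {missing_in_target.count()}")

# Surface a sample of unreconciled keys for root-cause investigation.
missing_in_target.select("order_id").show(10, truncate=False)
```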
3. Collaboration & Governance Support
- Partner with data stewards, governance, and analytics teams to define and maintain DQ rules and SLAs.
- Contribute to metadata enrichment, lineage documentation, and data catalog integration.
- Support adoption of DQ frameworks and promote data reliability best practices.
4. Automation & Continuous Improvement
- Integrate DQ validations into orchestration tools (Airflow, Databricks Workflows, or ADF); a minimal Airflow sketch follows this list.
- Leverage Python/PySpark libraries to extend and complement the capabilities of existing DQ platforms.
- Propose process improvements to enhance automation, monitoring, and exception management.
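As one possible shape for this integration, the sketch below wires a DQ gate into an Airflow 2.x DAG so that a failed validation fails the run before bad data reaches consumers. The DAG id, schedule, and run_dq_checks callable are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_dq_checks():
    """Hypothetical gate: raise if any DQ rule failed so Airflow fails the task."""
    failed_rules = []  # in practice, read from a validation-results table
    if failed_rules:
        raise ValueError(f"DQ rules failed: {failed_rules}")


with DAG(
    dag_id="orders_pipeline_with_dq",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    dq_gate = PythonOperator(task_id="dq_gate", python_callable=run_dq_checks)
```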
Core Technical Skills
- Data Engineering & Quality: Databricks (Delta Lake), PySpark, SQL, Python
- DQ Platforms: Ataccama ONE / Studio (DQ rules, workflows, profiling)
- Orchestration: Apache Airflow, Azure Data Factory, or Databricks Jobs
- Data Warehouses: Databricks Lakehouse
- Version Control / CI/CD: Git, GitHub Actions, Azure DevOps
- Data Catalog / Lineage (optional): Collibra, Alation, Ataccama Catalog
- Cloud Environments: Azure (preferred), AWS, or GCP
Qualifications & Experience
- Bachelor's degree in Computer Science, Information Systems, Statistics, or a related field.
- 6-9 years of experience in data quality, data engineering, or analytics operations.
- Strong command of SQL, Python, and PySpark for data validation and troubleshooting.
- Proven experience with Ataccama DQ rule creation and monitoring.
- Hands-on exposure to Databricks for building and running data pipelines.
- Working knowledge of reconciliation processes, data profiling, and DQ metrics.
Soft Skills & Attributes
- Analytical thinker with strong problem-solving abilities.
- Detail-oriented and methodical approach to troubleshooting.
- Strong communication skills for cross-functional collaboration.
- Proactive mindset, capable of owning issues through resolution.
- Comfortable balancing hands-on technical work with business stakeholder interaction.
Preferred / Nice to Have
- Exposure to data governance frameworks or MDM initiatives.
- Familiarity with observability tools (Grafana, Datadog, Prometheus).
- Understanding of CI/CD practices for data quality deployment.
- Certification in Databricks, Ataccama, or a major cloud platform (Azure/AWS).
Success Indicators
- Increase in automated data quality coverage across critical datasets.
- Reduction in recurring manual DQ exceptions.
- Improved timeliness and accuracy of data available for analytics.
- Positive stakeholder feedback on data trust and reliability.