Responsibilities:
Develop and maintain scalable data pipelines using Python and PySpark.
Work on Databricks to build data workflows and manage big data transformations.
Write complex SQL queries for data analysis and transformation.
Optimize performance for large datasets across distributed computing environments.
Collaborate with cross-functional teams to understand data requirements and implement robust solutions.
Debug and troubleshoot data pipeline and processing issues.
Leverage cloud services (preferably Azure or GCP) to support data storage, processing, and orchestration.
Required Skills:
Proficiency in Python, PySpark, Databricks, and SQL.
Hands-on experience in performance tuning for large datasets.
Strong debugging and analytical problem-solving skills.
Basic knowledge of cloud platforms, preferably Azure or Google Cloud Platform (GCP).
Good to Have:
Experience with BigQuery.