Python Data Engineer @ CGI

 Python Data Engineer

Job Description

Sr. Full-Stack Engineer with Python expertise (focused primarily on Data Engineering, ELT/ETL)

Experience: 8+ years

Location: Chennai, Bengaluru and Hyderabad


Responsibilities & Qualifications

  • Strong hands-on experience in Python, leveraging libraries such as PySpark, Scikit-learn, TensorFlow, PyTorch, Pandas, PyArrow, and related tools for large-scale data transformation, processing, and automation.
  • Design, develop, and optimize distributed data pipelines using Apache Spark (PySpark).
  • Deploy and manage Spark workloads on AWS EMR (including cluster sizing, autoscaling, performance tuning).
  • Experience implementing centralized data governance and analytics solutions on AWS, including AWS Glue Data Catalog, EMR, Athena, PySpark, and Glue Jobs.
  • Develop ETL/ELT workflows using AWS Glue (Glue Jobs, Crawlers, Data Catalog).
  • Orchestrate data workflows using Step Functions / Airflow / Glue Workflows.
  • Strong understanding of business-layer modeling, data pipelines, data architecture, and both real-time and batch data processing frameworks.
  • Hands-on experience with AWS cloud services, including but not limited to:
    • EC2, S3, CloudFront, API Gateway, Lambda
    • RDS / PostgreSQL
    • IAM roles and access policies
    • Kubernetes (EKS, Helm)
  • Familiarity with DevOps tools such as Git, GitLab, CloudFormation, and Terraform is an added advantage.
  • Experience building and managing containerized applications using open-source technologies and Kubernetes.
  • Experience supporting data processing and enrichment platforms, including event streaming systems.
  • Experience developing component-based Single Page Applications (SPA) using Angular with TypeScript or comparable JavaScript frameworks.

Required Skills

  • Minimum 5 years of hands-on experience with PySpark and AWS Glue.
  • Experience managing and optimizing AWS EMR clusters.
  • Strong proficiency in SQL.
  • Practical experience with Scikit-learn, TensorFlow, or PyTorch.

Good to Have

  • Experience with Docker and containerization technologies.
  • Experience building Kafka-based streaming pipelines.
  • Exposure to modern data lake technologies such as Delta Lake, Apache Iceberg, or Apache Hudi.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time

Contact Details:

Company: CGI
Location(s): Hyderabad


Key skills: PySpark, EMR, AWS Glue, ETL, Python

₹ 16-25 Lacs P.A.

CGI

CGI Information Systems and Management Consultants