
Senior PySpark Data Engineer (Big Data, Cloud Data Solutions) @ Synechron



Job Description

Job Summary
Synechron is seeking a skilled PySpark Data Engineer to design, develop, and optimize data processing solutions leveraging modern big data technologies. In this role, you will lead efforts to build scalable data pipelines, support data integration initiatives, and work closely with cross-functional teams to enable data-driven decision-making. Your expertise will contribute to enhancing business insights and operational efficiency, positioning Synechron as a pioneer in adopting emerging data technologies.

Software Requirements
Required Software Skills:
PySpark (Apache Spark with Python) experience in developing data pipelines (see the pipeline sketch after this section)
Apache Spark ecosystem knowledge
Python programming (version 3.7 or higher)
SQL and relational database management systems (e.g., PostgreSQL, MySQL)
Cloud platforms (preferably AWS or Azure)
Version control: Git
Data workflow orchestration tools like Apache Airflow
Data management tools: SQL Developer or equivalent
Preferred Software Skills:
Experience with Hadoop ecosystem components
Knowledge of containerization (Docker, Kubernetes)
Familiarity with data lake and data warehouse solutions (e.g., AWS S3, Redshift, Snowflake)
Monitoring and logging tools (e.g., Prometheus, Grafana)
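
To ground the required skills above, here is a minimal sketch of the kind of PySpark batch pipeline they describe: read raw CSV, transform it with Spark SQL, and write partitioned Parquet. The bucket paths, column names, and view name (raw_orders, order_ts, amount, country) are illustrative assumptions, not details taken from the role.

# Minimal PySpark batch pipeline sketch (illustrative only).
# Paths, schema, and column names below are hypothetical assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("orders_daily_batch")  # hypothetical job name
    .getOrCreate()
)

# Ingest: read raw CSV with a header and an inferred schema (assumed input location).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3a://example-bucket/raw/orders/")
)

# Transform: register a temp view and aggregate with Spark SQL.
raw.createOrReplaceTempView("raw_orders")
daily_revenue = spark.sql("""
    SELECT CAST(order_ts AS DATE) AS order_date,
           country,
           SUM(amount)            AS total_amount,
           COUNT(*)               AS order_count
    FROM raw_orders
    GROUP BY CAST(order_ts AS DATE), country
""")

# Load: write the result as date-partitioned Parquet (assumed output location).
(
    daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://example-bucket/curated/daily_revenue/")
)

spark.stop()

On the cloud platforms named in the requirements, a job like this would typically be submitted to a managed Spark environment (for example EMR on AWS) rather than run locally.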

Overall Responsibilities
Lead the design and implementation of large-scale data processing solutions using PySpark and related technologies
Collaborate with data scientists, analysts, and business teams to understand data requirements and deliver scalable pipelines
Mentor junior team members on best practices in data engineering and emerging technologies
Evaluate new tools and methodologies to optimize data workflows and improve data quality
Ensure data solutions are robust, scalable, and aligned with organizational data governance policies
Stay informed on industry trends and technological advancements in big data and analytics
Support production environment stability and performance tuning of data pipelines (a brief tuning sketch follows this list)
Drive innovative approaches to extract value from large and complex datasets
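
One of the responsibilities above is performance tuning of production pipelines. As a small, assumption-laden illustration of what that often involves in PySpark, the snippet below broadcasts a small dimension table to avoid shuffling the large side of a join and coalesces the output before writing; the table names, join key, and paths are hypothetical.

# Illustrative PySpark tuning sketch: broadcast join + controlled output partitions.
# Table names, join key, and paths are hypothetical assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_enrichment").getOrCreate()

orders = spark.read.parquet("s3a://example-bucket/curated/orders/")     # large fact table (assumed)
countries = spark.read.parquet("s3a://example-bucket/ref/countries/")   # small dimension table (assumed)

# Broadcast the small dimension so the join avoids shuffling the large side.
enriched = orders.join(F.broadcast(countries), on="country_code", how="left")

# Coalesce to a modest number of output files instead of one file per shuffle partition.
(
    enriched
    .coalesce(64)
    .write
    .mode("overwrite")
    .parquet("s3a://example-bucket/curated/orders_enriched/")
)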

Technical Skills (By Category)
Programming Languages:
Required: Python (minimum 2 years of PySpark experience)
Preferred: Scala (for Spark), SQL, Bash scripting
Databases/Data Management:
Relational databases (PostgreSQL, MySQL)
Distributed storage solutions (HDFS, cloud object storage like S3 or Azure Blob Storage)
Data warehousing platforms (Snowflake, Redshift preferred)
Cloud Technologies:
Required: Experience deploying and managing data solutions on AWS or Azure
Preferred: Knowledge of cloud-native services like EMR, Data Factory, or Azure Data Lake
Frameworks and Libraries:
Apache Spark (PySpark)
Airflow or similar orchestration tools
Data processing frameworks (Kafka, Spark Streaming preferred; see the streaming sketch after this section)
Development Tools and Methodologies:
Version control with Git
Agile management tools: Jira, Confluence
Continuous integration/deployment pipelines (Jenkins, GitLab CI)
Security Protocols:
Understanding of data security, access controls, and GDPR compliance in cloud environments
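
The framework list above names Kafka and Spark Streaming. As a rough illustration of how those pieces fit together, the following is a minimal Structured Streaming sketch that consumes JSON events from a Kafka topic and appends them to Parquet with checkpointing; the broker address, topic name, event schema, and storage paths are hypothetical, and the spark-sql-kafka connector package is assumed to be on the classpath.

# Minimal Spark Structured Streaming sketch: Kafka -> Parquet.
# Broker, topic, schema, and paths are hypothetical assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("events_stream").getOrCreate()

# Assumed shape of the JSON events on the topic.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Source: subscribe to a Kafka topic (requires the spark-sql-kafka package).
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker address
    .option("subscribe", "events")                     # assumed topic name
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Sink: append to Parquet with a checkpoint so the stream can recover after restarts.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/streaming/events/")                 # assumed output
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .outputMode("append")
    .start()
)

query.awaitTermination()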

Experience Requirements
Minimum of 5 years in data engineering, with hands-on PySpark experience
Proven track record of developing, deploying, and maintaining scalable data pipelines
Experience working with data lakes, data warehouses, and cloud data services
Demonstrated leadership in projects involving big data technologies
Experience mentoring junior team members and collaborating across teams
Prior experience in financial, healthcare, or retail sectors is beneficial but not mandatory

Day-to-Day Activities
Develop, optimize, and deploy big data pipelines using PySpark and related tools (an orchestration sketch follows this list)
Collaborate with data analysts, data scientists, and business teams to define data requirements
Conduct code reviews, troubleshoot pipeline issues, and optimize performance
Mentor junior team members on best practices and emerging technologies
Design solutions for data ingestion, transformation, and storage
Evaluate new tools and frameworks for continuous improvement
Maintain documentation, monitor system health, and ensure security compliance
Participate in sprint planning, daily stand-ups, and project retrospectives to align priorities
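
Several of the activities above involve scheduling and deploying pipelines with an orchestration tool such as Apache Airflow (listed in the software requirements). Below is a minimal Airflow 2.x-style sketch of what that could look like: a nightly DAG that submits the batch job via spark-submit. The DAG id, schedule, retry settings, and script path are illustrative assumptions.

# Minimal Airflow 2.x DAG sketch that schedules a nightly spark-submit run.
# DAG id, schedule, and the script path are hypothetical assumptions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",   # assumed team name
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="orders_daily_batch",   # hypothetical DAG id
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",  # nightly at 02:00
    catchup=False,
    tags=["pyspark", "batch"],
) as dag:

    run_batch = BashOperator(
        task_id="spark_submit_orders_batch",
        # Submits the batch job sketched earlier; cluster settings are illustrative.
        bash_command=(
            "spark-submit --master yarn --deploy-mode cluster "
            "/opt/jobs/orders_daily_batch.py"
        ),
    )

In practice the job itself would live in Git and be promoted through the CI/CD tooling named above (Jenkins or GitLab CI); the DAG only handles scheduling and retries.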

Qualifications
Bachelor's or Master's degree in Computer Science, Information Technology, or a related discipline
Relevant industry certifications (e.g., AWS Data Analytics, GCP Professional Data Engineer) preferred
Proven experience working with PySpark and big data ecosystems
Strong understanding of software development lifecycle and data governance standards
Commitment to continuous learning and professional development in data engineering technologies

Professional Competencies
Analytical mindset and problem-solving acumen for complex data challenges
Effective leadership and team management skills
Excellent communication skills tailored to technical and non-technical audiences
Adaptability in fast-evolving technological landscapes
Strong organizational skills to prioritize tasks and manage multiple projects
Innovation-driven with a passion for leveraging emerging data technologies

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Platform Engineer
Employment Type: Full time

Contact Details:

Company: Synechron
Location(s): Hyderabad



Keyskills: PySpark, PostgreSQL, Big Data, Kafka, Spark Streaming, SQL, Jenkins, Apache Airflow, Git, Cloud Data Solutions, Bash scripting, GitLab CI, MySQL, Python


Salary: Not Disclosed

Similar positions

CTO - Quantum Engineering - Developer

  • Wipro
  • 2 - 7 years
  • Bengaluru
  • 3 days ago
₹ Not Disclosed

Data Engineer (Azure Purview)

  • Capgemini
  • 6 - 11 years
  • Hyderabad
  • 3 days ago
₹ Not Disclosed

MLOps Engineer

  • Capgemini
  • 5 - 10 years
  • Hyderabad
  • 3 days ago
₹ Not Disclosed

Custom Software Engineer

  • Accenture
  • 2 - 5 years
  • Mumbai
  • 4 days ago
₹ Not Disclosed

Synechron

We are a digital solutions and technology services company that partners with global organizations across industries to achieve digital transformation. With a strong track record of innovation, investment in digital solutions, and commitment to client success, at Synechron, you can help clients achieve...