Job Title: Big Data Developer
Location: Navi Mumbai, India
Exp: 5+ Years
Department: Big Data and Cloud
Job Summary: Smartavya Analytica Private Limited is seeking a skilled Big Data Developer to join our team and contribute to the development and maintenance of large-scale Big Data solutions. The ideal candidate will have extensive experience with Hadoop ecosystem technologies and a solid understanding of distributed computing, data processing, and data management.
Company: Smartavya Analytica Private Limited is a niche Data and AI company. Based in Pune, we are pioneers in data-driven innovation, transforming enterprise data into strategic insights. Established in 2017, our team has experience handling large datasets of up to 20 PB in a single implementation and has delivered many successful data and AI projects across major industries, including retail, finance, telecom, manufacturing, insurance, and capital markets. We are leaders in Big Data, Cloud, and Analytics projects, with deep specialization in very large data platforms.
https://smart-analytica.com
Empowering Your Digital Transformation with Data Modernization and AI
Requirements:
A minimum of 3 years of experience in developing, testing, and implementing Big Data projects using Hadoop, Spark, and Hive.
Hands-on experience playing a lead role in Big Data projects: implementing one or more tracks within a project, identifying and assigning tasks within the team, and providing technical guidance to team members.
Experience in setting up Hadoop services and implementing ETL/ELT pipelines, ingesting and processing terabytes of data from varied systems (see the PySpark ETL sketch after this list).
Experience working in an onshore/offshore model, leading technical discussions with customers, mentoring and guiding teams on technology, and preparing HDD (high-level design) and LDD (low-level design) documents.
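
To make the ingestion and processing requirement above concrete, here is a minimal PySpark batch ETL sketch. It is illustrative only: the paths, table names, and columns are assumptions, not part of any actual project codebase.

```python
# Minimal PySpark batch ETL sketch: ingest raw CSV from a landing zone, apply
# basic cleansing, and load into a partitioned Hive table. All paths, table
# names, and columns below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("daily-orders-ingest")   # hypothetical job name
    .enableHiveSupport()              # required to write managed Hive tables
    .getOrCreate()
)

# Extract: read one day's raw files from a hypothetical landing directory.
raw = spark.read.option("header", "true").csv("/data/landing/orders/2024-01-01/")

# Transform: drop rows missing the business key, cast amounts, stamp the load date.
clean = (
    raw.dropna(subset=["order_id"])
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("load_date", F.current_date())
)

# Load: append into a Hive table partitioned by load date.
clean.write.mode("append").partitionBy("load_date").saveAsTable("analytics.orders")
```
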
Skills:
Must have: PySpark; the Hadoop ecosystem, including Hive, Sqoop, Impala, Oozie, and Hue; Java; Python; SQL; and Bash (shell scripting).
Apache Kafka, Apache Storm, and distributed systems; a good understanding of networking and of security (platform and data) concepts, including Kerberos (see the streaming sketch after this list).
Understanding of data governance concepts and experience implementing metadata capture, lineage capture, and a business glossary.
Experience implementing CI/CD pipelines and working with SCM tools such as Git and Bitbucket.
Ability to assign and manage tasks for team members, provide technical guidance, and work with architects on HDDs, LDDs, and POCs.
Hands-on experience writing data ingestion and data processing pipelines using Spark and SQL, implementing SCD Type 1 and Type 2, and building auditing and exception-handling mechanisms (see the SCD Type 2 sketch after this list).
Experience implementing data warehousing projects, with a Java- or Scala-based Hadoop programming background.
Proficiency with development methodologies such as Waterfall and Agile/Scrum.
Exceptional communication, organization, and time management skills
Collaborative approach to decision-making and strong analytical skills
Good to have: certifications in any of GCP, AWS, Azure, or Cloudera.
Ability to work on multiple projects simultaneously, prioritizing appropriately.
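
Because the skills list pairs Apache Kafka with Spark, the following is a minimal sketch of consuming a Kafka topic with Spark Structured Streaming. The broker address, topic name, event schema, and output paths are all hypothetical, chosen only for illustration.

```python
# Minimal Spark Structured Streaming sketch: consume JSON events from Kafka and
# land them as Parquet. Brokers, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

# Kafka delivers the payload as bytes in the `value` column; parse it as JSON.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")   # hypothetical broker
    .option("subscribe", "orders")                       # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Write each micro-batch to Parquet; checkpointing makes the stream restartable.
query = (
    events.writeStream.format("parquet")
    .option("path", "/data/streams/orders/")
    .option("checkpointLocation", "/checkpoints/orders/")
    .start()
)
query.awaitTermination()
```
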
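And here is a minimal sketch of the expire-and-insert pattern behind SCD Type 2 on plain Spark DataFrames, tracking a single attribute. Table, column, and key names are hypothetical; a production job would add the auditing and exception handling the skills list calls for.

```python
# Minimal SCD Type 2 sketch on plain Spark DataFrames: close the open row for
# each changed key and append a new version. Names and data are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

# Current dimension: one open row per key (end_date is NULL while current).
dim = spark.createDataFrame(
    [("C1", "Pune", "2023-01-01", None)],
    "cust_id string, city string, start_date string, end_date string",
)
# Today's incoming snapshot from the source system.
src = spark.createDataFrame([("C1", "Mumbai"), ("C2", "Delhi")], ["cust_id", "city"])

today = F.current_date().cast("string")
open_rows = dim.filter(F.col("end_date").isNull())
closed_rows = dim.filter(F.col("end_date").isNotNull())

# Keys whose tracked attribute changed in the source.
changed = (
    open_rows.alias("d").join(src.alias("s"), "cust_id")
    .filter(F.col("d.city") != F.col("s.city"))
    .select("cust_id")
)

# Type 2 expiry: stamp end_date on the open rows of changed keys.
expired = open_rows.join(changed, "cust_id").withColumn("end_date", today)

# New versions: changed keys plus brand-new keys, opened as of today.
inserts = (
    src.join(open_rows.select("cust_id", F.col("city").alias("old_city")),
             "cust_id", "left")
    .filter(F.col("old_city").isNull() | (F.col("city") != F.col("old_city")))
    .select("cust_id", "city", today.alias("start_date"),
            F.lit(None).cast("string").alias("end_date"))
)

# Untouched history plus open rows of unchanged keys survive as-is.
kept_open = open_rows.join(changed, "cust_id", "left_anti")
new_dim = closed_rows.unionByName(kept_open).unionByName(expired).unionByName(inserts)
new_dim.show()
```
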

Keyskills: PySpark, SQL, Cloudera, Hadoop, Big Data, HDFS, Impala, Hive, Sqoop, Data Engineer, Hue, Oozie, Spark, Python