Design and build high-performing, scalable data processing systems to support multiple internal and third-party data pipelines
Write Spark jobs in Scala for data transformation, aggregation, ETL, and machine learning (a minimal sketch follows this list)
Tune Spark/Scala jobs and optimize their performance
Carry out design, coding, unit testing, and other SDLC activities in a big data environment
Gather and understand requirements, analyze and convert functional requirements into concrete technical tasks, and provide reasonable effort estimates
Work proactively, independently, and with global teams to meet project requirements, and raise issues and challenges with enough lead time to address project delivery risks
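A minimal sketch of the kind of Spark/Scala ETL job referenced above, assuming a hypothetical CSV source and column names (event_time, user_id, amount); the repartition before the write hints at the sort of tuning the role involves:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Illustrative ETL job: read raw events, clean them, aggregate per day,
    // and write partitioned Parquet. Paths and column names are hypothetical.
    object DailyRevenueJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-revenue")
          .getOrCreate()

        val events = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("s3a://example-bucket/raw/events/") // assumed source location

        val daily = events
          .filter(col("amount").isNotNull)              // basic cleansing
          .withColumn("day", to_date(col("event_time")))
          .groupBy("day", "user_id")
          .agg(sum("amount").as("daily_revenue"))       // aggregation step

        daily
          .repartition(col("day")) // shuffle once to control output file counts
          .write
          .mode("overwrite")
          .partitionBy("day")
          .parquet("s3a://example-bucket/curated/daily_revenue/") // assumed sink

        spark.stop()
      }
    }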
Qualifications:
Minimum of 3 years of hands-on experience with Spark/Scala, within 4-8 years of overall development experience, including RDBMS systems.
In-depth knowledge of Scala and the Spark component ecosystem is a must
Strong knowledge of distributed systems and a solid understanding of big data systems in the Hadoop ecosystem.
Experience integrating data from multiple data sources (RDBMS, APIs); see the sketch after this list
Experience with AWS or other cloud providers
Experience developing and deploying large-scale distributed applications.
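A brief sketch of the multi-source integration mentioned above, assuming a hypothetical Postgres table read over JDBC joined with API data staged as JSON; the partitioning options parallelize the JDBC read:

    import org.apache.spark.sql.SparkSession

    // Illustrative multi-source integration: an RDBMS table read over JDBC,
    // joined with API data staged as JSON. Connection details, table, and
    // column names are assumptions for the example.
    object JdbcIngestJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("jdbc-ingest").getOrCreate()

        val orders = spark.read
          .format("jdbc")
          .option("url", "jdbc:postgresql://db.example.com:5432/shop") // hypothetical
          .option("dbtable", "public.orders")
          .option("user", sys.env.getOrElse("DB_USER", "app_user"))
          .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
          .option("partitionColumn", "order_id") // parallelize the read
          .option("lowerBound", "1")
          .option("upperBound", "1000000")
          .option("numPartitions", "8")
          .load()

        // Records previously fetched from an API and staged as JSON (assumed layout).
        val customers = spark.read.json("s3a://example-bucket/staging/customers/")

        orders
          .join(customers, Seq("customer_id"), "left")
          .write
          .mode("overwrite")
          .parquet("s3a://example-bucket/curated/orders_enriched/")

        spark.stop()
      }
    }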
Additional Skills:
Knowledge of AWS services such as AWS Glue, S3, and SQS (see the sketch after this list)
Exposure to Elasticsearch or Solr is a plus
Exposure to NoSQL databases such as Cassandra and MongoDB
Exposure to serverless computing
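A short sketch of a Spark job reading from and writing to S3 via the s3a connector; the bucket names are placeholders and the credentials-provider setting is one common option, usually preconfigured on AWS Glue or EMR:

    import org.apache.spark.sql.SparkSession

    // Illustrative S3 round trip via the s3a connector. Bucket names are
    // placeholders; the credentials provider shown is one common choice and
    // is typically already configured on AWS Glue or EMR.
    object S3RoundTripJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("s3-round-trip")
          .config(
            "spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
          .getOrCreate()

        val revenue = spark.read.parquet("s3a://example-bucket/curated/daily_revenue/")

        revenue
          .filter("daily_revenue > 0")
          .write
          .mode("overwrite")
          .parquet("s3a://example-bucket/reports/positive_revenue/")

        spark.stop()
      }
    }

SQS, by contrast, is usually consumed from the AWS SDK or a Lambda trigger rather than from Spark directly.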
Education:
UG: B.Tech/B.E. - Any Specialization, Computers, BCA - Computers, B.Sc - Any Specialization
PG: M.Tech - Any Specialization, MS/M.Sc(Science) - Any Specialization, MCA - Computers