Desired Candidate Profile
Job Description:
- Build large-scale data processing and analytics platforms to address business needs
- Develop, maintain, and test data solutions across a wide variety of data platforms, including relational databases, big data platforms, and NoSQL databases
- Develop data ingestion and transformation routines to acquire data from internal and external data sources, manage distributed crawlers to parse data from web sources, and develop APIs for secure exchange of data
- Be passionate about continuous learning and about experimenting with, applying, and contributing to cutting-edge open source technologies and software paradigms
- Stay current with advancements in data processing platforms and understand their trade-offs
- Liaise with people managing Data Analytics at group companies to understand business problems and needs, secure the data supply chain, implement analysis solutions and visualize outcomes that support improved decision making for a customer
- Enable Machine Learning and other Data Science capabilities on Hadoop using Spark and related tools
- Balance important criteria such as operational cost, maintenance, scalability, and deployability
- Process Automation: Eliminate manual dependencies in the data integration process through extensive automation using Python or other scripting languages
- Create design specifications, deployment plans, and other technical documents for the respective activities
Qualifications:
- 4+ years of relevant work experience in Data Engineering
- Experience developing highly scalable solutions that use cloud or on-premises infrastructure to process very large data sets in production. Experience with AWS/Azure is a plus.
- Experience using IaaS / PaaS / SaaS
- Must have hands-on experience with, and expert command of, at least two Big Data technologies other than Hive and Sqoop, for example: MapReduce programming; Spark Streaming; NoSQL (HBase/Cassandra); search (Solr/Elasticsearch); Flume; Kafka
- Expertise in at least one programming language such as Java, Python, or Scala
- Experience working with unstructured data sources and streaming data sets
Education:
UG: B.Tech/B.E. - Any Specialization
PG: Any Postgraduate - Any Specialization
Doctorate: Not Required
Keyskills:
Hadoop
Hive
Spark
Flume
Sqoop
Java
Scala
Data Science
Cassandra
NoSQL