Job Description
The candidate must have experience, preferably in the financial services industry, working on all aspects of the SDLC, with a focus on large-volume data processing in Python/Java and relational databases: applying complex business rules, performing transformation and aggregation, and designing data persistence models for both transactional data processing (OLTP) and data warehouses. Must have a strong ability to work on multiple IT projects simultaneously, collaborating with stakeholders, clients, and other system owners to drive projects to successful completion. Must have previous involvement in data modeling and/or systems architecture.
Responsibilities will include technical analysis, design, development, and enhancements. The candidate will participate in the following activities:
- Hands-on development working with large data volumes, building distributed processing in Python or Java using data frames, APIs, externalized business rules, etc. Understand parallel data processing concepts to build scalable processes.
- Use APIs for data exchange; work with multiple data structures and formats, e.g. JSON, flat files, Parquet, ORC, etc. Favor a metadata-/config-driven setup for ease of change.
- Strong knowledge of database table partitioning, data distribution, and parallel loads and extracts on a relational database (Db2, Greenplum, or other technologies)
- Build both batch and event-driven data processing. Design for a high level of concurrency to ensure data integrity without compromising performance
- Work in an agile squad as a contributor, collaborating with other developers toward a common goal
- Model and implement database schemas
- Propose system architecture/redesigns for greater efficiency and ease of maintenance, and develop the software to turn those proposals into implementations
Skills Required:
PySpark Developer, Capital Market
Function: Operations
India
With a startup spirit and 80,000+ curious and courageous minds, we have the expertise to go deep with the world's biggest brands, and we have fun doing it. Now, we're calling all you rule-breakers and risk-takers who see the world differently and are bold enough to reinvent it. Come, transform with us.
Inviting applications for the role of PySpark Developer
The candidate will be a hands-on Hadoop/Spark/PySpark developer, actively participating in the Agile process, keen on learning new technologies, and setting high standards for themselves and others on the team. The candidate will have excellent technical writing and communication skills.
- Work with multiple business teams to fully understand business requirements and translate them into data structures.
- Develop technical standards, procedures, and guidelines
- Create and maintain technical documentation, architecture designs and data flow diagrams.
- Constantly improve the SDLC process, actively participating in adopting industry best practices
- Interface with business professionals, application developers and technical staff working in an agile process and environment
- Strong understanding of the Big Data stack, including PySpark, Spark, Hive, HBase, Hadoop, Kafka, MapReduce, and Impala
- Strong SQL experience
- Good knowledge of Spark architecture
- In-depth knowledge of the PySpark API
- Experience working with Hadoop/HDFS and Hive
- Experience with various design patterns and good Python coding practices
- Experience implementing micro-services
- Cloud experience is good to have; experience implementing cloud-based services, GCP preferred
- Ability to learn quickly, manage work independently, and be a team player
- Implement, and ensure technical staff adhere to, overall IT system and security policies and standards
- Experience with data stores (both transactional and non-transactional) and the ability to code in a highly concurrent environment
- Experience building BI systems and analytics/business intelligence platforms
- Experience in the banking and financial industry
Qualifications
Minimum qualifications
BTech, Bachelor's, or Master's degree in Computer Science
Robust analytics experience
Employment Category:
Employment Type: Full time
Industry: IT - Software
Role Category: General / Other Software
Functional Area: Not Applicable
Role/Responsibilities: Consultant - PySpark + Hadoop Developer
Contact Details:
Company: Genpact India
Location(s): Bengaluru