Key Responsibilities
- Python & PySpark:
  - Writing efficient ETL (Extract, Transform, Load) pipelines.
  - Implementing data transformations using PySpark DataFrames and RDDs.
  - Optimizing Spark jobs for performance and scalability.
- Apache Spark:
  - Managing distributed data processing.
  - Implementing batch and streaming data processing.
  - Tuning Spark configurations for efficient resource utilization.
- Unix Shell Scripting:
  - Automating data workflows and job scheduling.
  - Writing shell scripts for file management and log processing.
  - Managing cron jobs for scheduled tasks.
- Google Cloud Platform (GCP) & BigQuery:
  - Designing data warehouse solutions using BigQuery.
  - Writing optimized SQL queries for analytics.
  - Integrating Spark with BigQuery for large-scale data processing.
Key skills: BigQuery, GCP, Python programming, Google Cloud, ETL pipelines, PySpark, API development, Unix shell scripting, MongoDB, Spark, Kafka, event streaming, CI/CD.
About Us: TEKsystems Global Services (TGS) is the software services division of TEKsystems. TEKsystems is a $4.3 billion organization with over 150 offices across the globe. TGS accounts for $600 million of that and has about 5,000 full-time employees worldwide, of whom 1,200 are in India. TGS ope...