Key skills: Python, PySpark, machine learning, ML, statistics, Hive, continuous integration, GitHub, supply chain, natural language processing, CI/CD, Microsoft Azure, Apache Pig, artificial intelligence, SQL, Docker, Databricks, Git, Spark, GCP, DevOps, Oracle, ADF, Jenkins, AWS