ADF Technical Lead CST 2017 JD Ver 2.0
Job Description
**mandatory
SN | Required Information | Details
1 | Role** | Technical Lead / Data Engineer
2 | Required Technical Skill Set** | Denodo, Azure Databricks, ADF, Snowflake, SQL, PySpark
3 | No of Requirements** | 2 (8+ Years)
4 | Desired Experience Range** | 8 - 10 Years
5 | Location of Requirement | PAN India (Noida, Bengaluru preferred)
Desired Competencies (Technical/Behavioral Competency)
Must-Have** (Ideally should not be more than 3-5):
Good-to-Have:
SN | Responsibility of / Expectations from the Role
1 | Responsible for supporting and managing source domain data flows, i.e. pipelines/Databricks
2 | Support of views (integrations), VDBs and various source domains built on top of Denodo
3 | Understanding of source systems and data
4 | Performing validation and testing post new deployments
5 | Supporting Azure Data Factory pipelines, ADB transformations, Databricks
6 | Debugging and resolving pipeline failures / data issues
7 | Establishing initial setup for the projects, resolving connectivity issues and performance tuning
Details For Candidate Briefing**
(It should NOT contain any confidential information or references)

About Client: (Please highlight domain, any ranking indicating size of / recognition for the client. It helps sell the role to prospective candidates.)
The Client is one of the largest pharmaceutical companies in the world, focusing on research and development of a broad range of innovative products in its primary areas of Pharmaceuticals and Vaccines.
(Example: The Client is one of the top 5 telecom service providers of Europe; or it is a BFSI client which figures in the Forbes global top 100 and has a presence in 60 countries, spread across 4 continents.)

USP of the Role and Project: (Why would a candidate be keen to apply for this role in this project, i.e. what's in it for me? Please highlight the USP of the role in terms of the nature of the project, technology involved, growth prospects, onsite opportunity, opportunity for learning and certifications etc. It helps sell the role to prospective candidates.)

The engagement will provide a strong platform for candidates to apply their Data Engineering skills and strengthen them further for a successful Data Engineer career. It is a challenging role involving support of DDF domains.

(Example: the next gen billing system involving latest technologies like xyz.)
Subject Matter Expert (SME)**
For faster and better search and screening, it is important for the Sourcing team to understand the technical expectations from the prospective candidate. So please provide the coordinates of a few SMEs from your project, who shall take interviews of such candidates / review their CVs in future. Our experts may connect with these SMEs for a brief discussion over Webex.
Name | Email ID | Contact Number
Vinay Pa**** | ************2@***.com | +91-9719141***
Market Intelligence / Vendor Approval**
Please specify companies that are competition for TCS for this particular project / any other companies where these skills would be available: Accenture, Cognizant
In the past, many associates with similar skills may have joined us through external hiring and are still continuing in the project. We would like to know the details of such associates (name, emp number). This shall help us to focus on target companies better: NA
Please specify if the project is open to sharing the requirements with vendors (Y/N)**: Yes
To check on Must-have skills, screening questions, if any:

Q1. What is Denodo and data virtualization?
Ans: Denodo is a data virtualization platform that provides real-time access to disparate data sources without physically moving or copying the data. It creates a unified, virtual data layer, enabling users to access and integrate data from various sources as if they were a single source.

Q2. What is Azure Databricks, and how does it integrate with Azure?
Ans: Azure Databricks is a data analytics and AI service offered on Microsoft Azure. It unifies data, the data ecosystem, and data teams. It integrates with multiple Azure services, such as Azure Data Lake Storage, Power BI, Azure Synapse Analytics, Azure Data Factory, and others, for advanced solutions and enhanced performance.

Q3. What are the different ways to execute pipelines in Azure Data Factory?
Ans: There are three ways to execute a pipeline in Data Factory:
1. Debug mode, which runs the pipeline directly from the authoring canvas for testing.
2. Manual execution, using the Trigger Now option.
3. Adding triggers: schedule, tumbling window, or event-based triggers.
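To make Q1 concrete, here is a minimal sketch of querying a Denodo virtual view from Python over ODBC. The DSN (denodo_vdp), credentials, and view name (customer_360) are hypothetical placeholders, assuming the Denodo ODBC driver is installed and a DSN has been configured:

    import pyodbc

    # Connect to the Denodo Virtual DataPort server through an ODBC DSN
    # (hypothetical DSN and credentials).
    conn = pyodbc.connect("DSN=denodo_vdp;UID=report_user;PWD=secret")

    # The virtual view looks like an ordinary table, but Denodo federates the
    # query across the underlying sources at execution time; no data is copied.
    cursor = conn.cursor()
    cursor.execute(
        "SELECT customer_id, region, total_spend FROM customer_360 WHERE region = ?",
        "EU",
    )
    for row in cursor.fetchall():
        print(row.customer_id, row.total_spend)

    conn.close()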
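For Q2, a small PySpark sketch of the ADLS Gen2 integration as it might look inside an Azure Databricks notebook; the storage account, containers, and paths are hypothetical, and access to the storage account is assumed to be already configured on the cluster:

    # Inside a Databricks notebook, `spark` is provided by the runtime.
    # Read a CSV landed in ADLS Gen2, clean it, and write it back as Delta.
    raw = (spark.read
           .option("header", "true")
           .option("inferSchema", "true")
           .csv("abfss://raw@examplestorageacct.dfs.core.windows.net/sales/2024/"))

    cleaned = raw.dropDuplicates(["order_id"]).filter("amount > 0")

    (cleaned.write
            .format("delta")
            .mode("overwrite")
            .save("abfss://curated@examplestorageacct.dfs.core.windows.net/sales/"))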
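For Q3, the manual "Trigger Now" path can also be driven programmatically; a sketch using the azure-mgmt-datafactory SDK, with hypothetical subscription, resource group, factory, and pipeline names:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    # Hypothetical subscription id; the credential picks up the local Azure login.
    subscription_id = "00000000-0000-0000-0000-000000000000"
    client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

    # Start a single on-demand run and capture its run id, which can later be
    # polled with client.pipeline_runs.get(...).
    run = client.pipelines.create_run(
        resource_group_name="rg-data-platform",
        factory_name="adf-source-domain",
        pipeline_name="pl_copy_source_to_raw",
        parameters={"load_date": "2024-01-31"},
    )
    print("Started pipeline run:", run.run_id)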
Q4. What types of SQL commands (or SQL subsets) do you know?
Ans: The main types are:
1. DDL (Data Definition Language): CREATE, ALTER, DROP, TRUNCATE.
2. DML (Data Manipulation Language): INSERT, UPDATE, DELETE.
3. DQL (Data Query Language): SELECT.
4. DCL (Data Control Language): GRANT, REVOKE.
5. TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT.
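A tiny runnable illustration of the DDL, DML, TCL, and DQL subsets using Python's built-in sqlite3 module (SQLite has no GRANT/REVOKE, so DCL is not shown):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL: define structure
    cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

    # DML: change data
    cur.execute("INSERT INTO orders (id, amount) VALUES (1, 99.5)")
    cur.execute("UPDATE orders SET amount = 120.0 WHERE id = 1")

    # TCL: make the changes durable
    conn.commit()

    # DQL: read data
    print(cur.execute("SELECT id, amount FROM orders").fetchall())

    conn.close()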
Q5. Is Snowflake an ETL tool?
Ans: Yes, Snowflake is an ETL tool. It follows a three-step process:
1. Extract data from the source and create data files. Data files support multiple formats like JSON, CSV, XML, and more.
2. Load the data files to an internal or external stage. Data can be staged in an internal stage, a Microsoft Azure blob, an Amazon S3 bucket, or a Snowflake-managed location.
3. Copy the data into a Snowflake database table using the COPY INTO command (a connector-based sketch of steps 2 and 3 appears after Q6 below).

Q6. What are the key differences between RDDs, DataFrames, and Datasets in PySpark?
Ans: Resilient Distributed Datasets (RDDs), DataFrames, and Datasets are key abstractions in Spark that enable us to work with structured data in a distributed computing environment. Even though they are all ways of representing data, they have key differences:
- RDD: the low-level, schema-less abstraction; offers fine-grained control but no Catalyst/Tungsten optimization.
- DataFrame: data organized into named columns, like a relational table; queries are optimized by the Catalyst engine, and it is the primary abstraction in PySpark.
- Dataset: a strongly typed extension of the DataFrame API available in Scala and Java; in PySpark the DataFrame covers this role, since Python is dynamically typed.
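For Q6, a short PySpark sketch contrasting the RDD and DataFrame abstractions (the typed Dataset API exists only on the JVM in Scala/Java; in PySpark the DataFrame fills that role); the sample data and app name are purely illustrative:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdd-vs-dataframe").getOrCreate()

    # RDD: low-level, schema-less collection of Python objects; transformations
    # are opaque functions, so Spark cannot optimize them with Catalyst.
    rdd = spark.sparkContext.parallelize([("alice", 34), ("bob", 29)])
    adults_rdd = rdd.filter(lambda row: row[1] >= 30)

    # DataFrame: named columns with a schema; filters and projections are
    # analyzed and optimized by Catalyst before execution.
    df = spark.createDataFrame(rdd, schema=["name", "age"])
    adults_df = df.filter(df.age >= 30).select("name")
    adults_df.show()

    # You can always drop back to the underlying RDD when needed.
    print(df.rdd.take(1))

    spark.stop()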
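Returning to Q5, a minimal sketch of the stage-and-COPY INTO flow using the snowflake-connector-python package; the account, credentials, warehouse, stage, file path, and table names are hypothetical placeholders:

    import snowflake.connector

    # Hypothetical account and credentials.
    conn = snowflake.connector.connect(
        account="xy12345.west-europe.azure",
        user="loader",
        password="secret",
        warehouse="LOAD_WH",
        database="SALES_DB",
        schema="RAW",
    )
    cur = conn.cursor()

    # Step 2: stage a local data file in an internal stage.
    cur.execute("CREATE STAGE IF NOT EXISTS sales_stage")
    cur.execute("PUT file:///tmp/orders.csv @sales_stage AUTO_COMPRESS=TRUE")

    # Step 3: copy the staged file into the target table.
    cur.execute("""
        COPY INTO ORDERS
        FROM @sales_stage/orders.csv.gz
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)

    conn.close()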

Keyskills: Azure Data Factory, Databricks, SQL