Job Description
At Capgemini Invent, we believe difference drives change. As inventive transformation consultants, we blend our strategic, creative and scientific capabilities,collaborating closely with clients to deliver cutting-edge solutions. Join us to drive transformation tailored to our client's challenges of today and tomorrow.Informed and validated by science and data. Superpowered by creativity and design. All underpinned by technology created with purpose.
Your role As a Senior Data Scientist, you are expected to develop and implement Artificial Intelligence based solutions across various disciplines for the Intelligent Industry vertical of Capgemini Invent. You are expected to work as an individual contributor or along with a team to help design and develop ML/NLP models as per the requirement. You will work closely with the Product Owner, Systems Architect and other key stakeholders right from conceptualization till the implementation of the project. You should take ownership while understanding the client requirement, the data to be used, security & privacy needs and the infrastructure to be used for the development and implementation. The candidate will be responsible for executing data science projects independently to deliver business outcomes and is expected to demonstrate domain expertise, develop, and execute program plans and proactively solicit feedback from stakeholders to identify improvement actions. This role requires a strong technical background, excellent problem-solving skills, and the ability to work collaboratively with stakeholders from different functional and business teams.
The role also requires the candidate to collaborate on ML asset creation and eager to learn and impart trainings to fellow data science professionals. We expect thought leadership from the candidate, especially on proposing to build a ML/NLP asset based on expected industry requirements. Experience in building Industry specific (e.g. Manufacturing, R&D, Supply Chain, Life Sciences etc), production ready AI Models using microservices and web-services is a plus. Programming Languages Python NumPy, SciPy, Pandas, MatPlotLib, Seaborne
Databases RDBMS (MySQL, Oracle etc.), NoSQL Stores (HBase, Cassandra etc.)
ML/DL Frameworks SciKitLearn, TensorFlow (Keras), PyTorch,
Big data ML Frameworks - Spark (Spark-ML, Graph-X), H2O
Cloud Azure/AWS/GCP
Your Profile Predictive and Prescriptive modelling using Statistical and Machine Learning algorithms including but not limited to Time Series, Regression, Trees, Ensembles, Neural-Nets (Deep & Shallow CNN, LSTM, Transformers etc.). Experience with open-source OCR engines like Tesseract, Speech recognition, Computer Vision, face recognition, emotion detection etc. is a plus.
Unsupervised learning Market Basket Analysis, Collaborative Filtering, Dimensionality Reduction, good understanding of common matrix decomposition approaches like SVD. Various Clustering approaches Hierarchical, Centroid-based, Density-based, Distribution-based, Graph-based clustering like Spectral.
NLP Information Extraction, Similarity Matching, Sentiment Analysis, Text Clustering, Semantic Analysis, Document Summarization, Context Mapping/Understanding, Intent Classification, Word Embeddings, Vector Space Models, experience with libraries like NLTK, Spacy, Stanford Core-NLP is a plus. Usage of Transformers for NLP and experience with LLMs like (ChatGPT, Llama) and usage of RAGs (vector stores like LangChain & LangGraps), building Agentic AI applications.
Model Deployment ML pipeline formation, data security and scrutiny check and ML-Ops for productionizing a built model on-premises and on cloud. Required Qualifications
Masters degree in a quantitative field such as Mathematics, Statistics, Machine Learning, Computer Science or Engineering or a bachelors degree with relevant experience.
Good experience in programming with languages such as Python/Java/Scala, SQL and experience with data visualization tools like Tableau or Power BI. Preferred Experience
Experienced in Agile way of working, manage team effort and track through JIRA
Experience in Proposal, RFP, RFQ and pitch creations and delivery to the big forum.
Experience in POC, MVP, PoV and assets creations with innovative use cases
Experience working in a consulting environment is highly desirable.
Presupposition High Impact client communication
The job may also entail sitting as well as working at a computer for extended periods of time. Candidates should be able to effectively communicate by telephone, email, and face to face.
What you will love about working here We recognize the significance of flexible work arrangements to provide support. Be it remote work, or flexible work hours, you will get an environment to maintain healthy work life balance. At the heart of our mission is your career growth. Our array of career growth programs and diverse professions are crafted to support you in exploring a world of opportunities. Equip yourself with valuable certifications in the latest technologies such as Generative AI. Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Data Scientist
Employement Type: Full time
Contact Details:
Company: Capgemini
Location(s): Noida, Gurugram
Keyskills:
numpy
sql
java
python
pandas
scala
poc
nltk
dl
artificial intelligence
tensorflow
spacy
spark
gcp
pytorch
keras
mysql
hbase
ml
jira
scipy
rdbms
oracle
mvp
microsoft azure
power bi
nosql
tableau
cassandra
matplotlib
agile
aws