Job Title: Data Scientist - Classic ML, Deep Learning & NLP (Immediate Joiner)
Location: Gurugram/Bangalore
Department: Data Science / AI & Analytics
Experience: 3-5 years
Employment Type: Full-Time
Job Summary
We are seeking a highly skilled Data Scientist with strong expertise in Machine Learning, Deep Learning, and Natural Language Processing (NLP). The ideal candidate will have hands-on experience designing, developing, and deploying AI/ML models that solve real-world problems, optimize decision-making, and drive innovation.
Key Responsibilities
Data Analysis & Preprocessing:
Collect, clean, transform, and analyze large structured and unstructured datasets using Python, SQL, and data wrangling tools.
Model Development:
Build and optimize machine learning models (classification, regression, clustering, recommendation systems) using algorithms such as Random Forest, XGBoost, and SVM.
Deep Learning & NLP:
Design and train deep neural networks using frameworks like TensorFlow or PyTorch.
Work on NLP applications such as text classification, entity recognition, sentiment analysis, and summarization, including applications built on large language models (LLMs).
Fine-tune transformer-based architectures (e.g., BERT, GPT, T5, LLaMA).
Feature Engineering & Model Evaluation:
Perform feature extraction, selection, and model validation using statistical and ML evaluation metrics.
Deployment & Integration:
Implement and deploy models into production using Docker, FastAPI, Flask, MLflow, or AWS SageMaker.
Research & Innovation:
Stay up to date with the latest advancements in AI/ML and NLP; contribute to POCs, patents, or research publications.
Collaboration & Communication:
Work closely with data engineers, product teams, and business stakeholders to translate analytical insights into strategic solutions.
Required Skills & Tools
Programming & Libraries:
Python (NumPy, Pandas, scikit-learn, TensorFlow, PyTorch, NLTK, spaCy, Transformers)
SQL and data querying skills
Familiarity with R (optional)
Machine Learning:
Regression, Classification, Clustering, Dimensionality Reduction, Feature Engineering
Model selection, hyperparameter tuning, cross-validation
Deep Learning:
CNNs, RNNs, LSTMs, Transformers
Generative AI: LLMs, diffusion models, embeddings, prompt engineering (preferred)
NLP:
Text preprocessing, tokenization, sentiment analysis, entity recognition
LLM fine-tuning, embeddings, vector databases (e.g., Pinecone, FAISS)
Tools & Platforms:
Git / GitHub, Jupyter, MLflow, Docker, Streamlit, FastAPI
Cloud Platforms: AWS / GCP / Azure
Versioning and CI/CD pipelines for model deployment
Education
Bachelor's or Master's degree in Computer Science, Data Science, AI/ML, Statistics, or a related field.

Key Skills: NLP, Vector DB, Large Language Model, Machine Learning, Deep Learning, PyTorch, TensorFlow, Scikit-learn, Python