Mandatory
1. Building applications for data preparation, including impurity removal, anomaly detection, identification of inconsistencies, and transformations
2. Building applications for data exploration, including missing-value imputation, outlier analysis, class-imbalance analysis, correlation analysis, and visualization
3. Building applications for feature engineering, including feature generation, feature transformation, and feature selection
4. Building applications for machine learning modeling, including model development, hyperparameter optimization, model selection, training, validation, and prediction
5. Building applications for machine learning automation, including automating the components of each application above and automating the integrated data science pipeline
6. Building sophisticated deterministic, stochastic, and neural network models from scratch in the map-reduce paradigm, enabling distributed computing
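For candidates gauging what item 6 might look like in practice, here is a minimal sketch of a deterministic model (ordinary least-squares line fitting) written from scratch in a map-reduce style. It is an illustration only, not a description of any internal codebase: the names to_stats, combine, and fit_line are hypothetical, and the data is made up for the example.

```python
from functools import reduce

# Hypothetical helper: the "map" step turns one (x, y) point into
# sufficient statistics (count, sum x, sum y, sum x^2, sum x*y).
def to_stats(point):
    x, y = point
    return (1, x, y, x * x, x * y)

# Hypothetical helper: the "reduce" step adds two statistics tuples.
# It is associative and commutative, so partial results computed on
# separate partitions can be merged in any order.
def combine(a, b):
    return tuple(u + v for u, v in zip(a, b))

def fit_line(points):
    """Fit y = slope * x + intercept by least squares via map-reduce."""
    n, sx, sy, sxx, sxy = reduce(combine, map(to_stats, points))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values must not all be identical")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Made-up data lying exactly on y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
slope, intercept = fit_line(data)
print(slope, intercept)
```

Because combine only sums tuples, the same two functions could in principle be passed to a distributed engine's map and reduce operations, with each partition producing a partial tuple that is merged at the end.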
Preferred
1. Functional programming in Python using the map-reduce and lambda paradigm
2. Implementing complex mathematical logic in PySpark at scale on parallel/distributed clusters
3. Experience developing data platforms
4. Experience with TensorFlow, PyTorch, and Keras
Key skills: Automation, Data science, Machine learning, HTML, HTTP, Natural language processing, Data analytics, Business intelligence, Apache, Python