I applied several classification models to a test dataset collected from Common Crawl (WET files for July 2019) to predict the overall sentiment toward Apple's iPhone and Samsung's Galaxy, using two labeled training sets.
Apart from the sentiment analysis itself, I used this project as an opportunity to explore the automatic feature engineering tools available in Python and R and to:
- Compare the R mlr and Python scikit-learn workflows.
- Compare different free, open-source automated machine learning (AutoML) frameworks (automatic hyperparameter tuning and automatic model selection):
  - Google's AutoML
  - TPOT, which optimizes pipelines using genetic programming
  - H2O's AutoML