a) Dataset: https://www.kaggle.com/datasets/saurabhshahane/twitter-sentiment-dataset
b) EDA (Exploratory Data Analysis): EDA.ipynb
Data Visualisation: Wordcloud, Countplot, Frequency
Data Preprocessing: Stop-word removal, Lemmatization
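
The two preprocessing steps listed above can be sketched without any dependencies. This is only an illustration, not the code in EDA.ipynb: the `STOP_WORDS` set, the `LEMMAS` lookup table, and the `tokenize` and `preprocess` helpers are all hypothetical names made up for this example, and the lookup table is a toy stand-in for a real dictionary-based lemmatizer.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch).
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "to", "of",
    "and", "in", "on", "it", "this", "that", "i", "my",
}

# Toy lemma lookup standing in for a real lemmatizer (assumption for this sketch).
LEMMAS = {
    "flights": "flight",
    "delayed": "delay",
    "loved": "love",
    "loving": "love",
    "movies": "movie",
}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def preprocess(text):
    """Tokenize, drop stop words, then map each remaining token to its lemma."""
    kept = [tok for tok in tokenize(text) if tok not in STOP_WORDS]
    # Tokens missing from the lookup table are passed through unchanged.
    return [LEMMAS.get(tok, tok) for tok in kept]


if __name__ == "__main__":
    print(preprocess("The flights were delayed and I loved it"))
```

Stop words are removed before lemmatization here so that the lookup only runs on tokens that survive the filter; the opposite order would also work for this toy table.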
c) Classification Task:
- TF-IDF (embedding) + multinomial Naive Bayes modelling: tfidf_mtltNB.ipynb
- Word2Vec (embedding) + multinomial Naive Bayes modelling: Word2vec_mltNB.ipynb
- Word2Vec (embedding) + deep learning (1 input + 1 dense + 1 output layer) modelling: Word2vec_nn.ipynb
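
As a rough illustration of the first approach in the list (TF-IDF features fed to a multinomial Naive Bayes classifier), here is a minimal dependency-free sketch. It is not taken from tfidf_mtltNB.ipynb, which may well use a library instead. The function names (`fit_idf`, `tfidf`, `fit_nb`, `predict`), the smoothed IDF formula, the Laplace smoothing constant `alpha`, and the four toy documents with their labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Raw term count times IDF; terms not seen while fitting are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}


def fit_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over non-negative weights, Laplace-smoothed."""
    class_counts = Counter(labels)
    weight_sums = defaultdict(Counter)  # per-class sum of each term's weight
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            weight_sums[label][term] += weight
    model = {}
    for label, n_label in class_counts.items():
        total = sum(weight_sums[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_label / len(labels))
        log_lik = {t: math.log((weight_sums[label][t] + alpha) / total) for t in vocab}
        model[label] = (log_prior, log_lik)
    return model


def predict(vec, model):
    """Return the class with the highest log posterior (up to a constant)."""
    scores = {
        label: log_prior + sum(w * log_lik[t] for t, w in vec.items())
        for label, (log_prior, log_lik) in model.items()
    }
    return max(scores, key=scores.get)


# Toy, already-preprocessed training data (hypothetical, not from the dataset).
train_docs = [
    ["love", "great", "flight"],
    ["great", "service", "love"],
    ["bad", "delay", "flight"],
    ["terrible", "delay", "bad"],
]
train_labels = ["positive", "positive", "negative", "negative"]

idf = fit_idf(train_docs)
vocab = sorted(idf)
model = fit_nb([tfidf(d, idf) for d in train_docs], train_labels, vocab)

if __name__ == "__main__":
    print(predict(tfidf(["great", "love"], idf), model))
    print(predict(tfidf(["bad", "delay"], idf), model))
```

Multinomial Naive Bayes assumes non-negative feature values, which TF-IDF weights satisfy. Word2Vec vectors generally contain negative components, so pairing them with this classifier would need some rescaling or a different Naive Bayes variant.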