- Little Birdy is a web app in which the user can enter any person, object, movie, or product name, and the app returns a detailed analysis of that subject based on real-time tweets on Twitter.
- The analysis/overview includes the tweets, their sentiment, hashtags, a word cloud for each sentiment, and a graph of the ratio of sentiments
- You can try out our web app by clicking Little Birdy
The Sentiment140 dataset, with 1.6 million tweets, was used for developing the model. It was custom-preprocessed to a high standard.
- Click here to get the dataset.
- ~5k null values were removed
- We tried to remove even fine-grained spelling errors, unexpected characters, and more
- Performed lemmatization to convert each word into its root form
- For the stopwords, we tried to remove the majority of 1-2 letter words. We then checked the top 500 highest-frequency words in the whole dataset and manually removed those that were not needed or were biased
- Tokenized and padded the dataset
- Converted words into vectors using FastText word embeddings (GloVe and BERT were implemented as well)
- Split the data into a train set and a test set in a 95:5 ratio
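
The preprocessing steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the project's actual pipeline: the function names, the tiny `STOPWORDS` set, the cleaning regexes, and the id scheme (0 for padding, 1 for unknown words) are all assumptions made for the example, and lemmatization and the embedding lookup are left out.

```python
import random
import re
from collections import Counter

# Hypothetical stopword set; the project's real list was curated by hand.
STOPWORDS = {"a", "an", "the", "is", "to", "of"}

def clean_tweet(text):
    """Lowercase, drop URLs, mentions and non-letter characters, then remove stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return [w for w in text.split() if w not in STOPWORDS]

def top_words(token_lists, n=500):
    """Return the n most frequent words, as candidates for manual stopword review."""
    counts = Counter(w for tokens in token_lists for w in tokens)
    return [w for w, _ in counts.most_common(n)]

def build_vocab(token_lists):
    """Map each word to an integer id; 0 is reserved for padding, 1 for unknown words."""
    vocab = {}
    for tokens in token_lists:
        for word in tokens:
            vocab.setdefault(word, len(vocab) + 2)
    return vocab

def encode_and_pad(tokens, vocab, max_len):
    """Turn tokens into ids, truncate to max_len, and right-pad with zeros."""
    ids = [vocab.get(w, 1) for w in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

def train_test_split(rows, test_ratio=0.05, seed=42):
    """Shuffle with a fixed seed and hold out test_ratio of the rows (95:5 by default)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_ratio))
    return rows[n_test:], rows[:n_test]
```

In a real pipeline, a library tokenizer and padding utility would likely replace `build_vocab` and `encode_and_pad`; the sketch only shows the shape of the data at each step.
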
- We tried working with three different word embeddings, namely, FastText, GloVe, and BERT
- BERT didn't work out due to limited resources
- Accuracy with the other two embeddings was similar, so we compared them by judging the output of the two models on specific sentences
- We tried two models, an LSTM and an LSTM-CNN hybrid, of which the LSTM showed comparatively better results
- Achieved an accuracy of 83.98% on the train set and 83.51% on the test set
- Then we saved the model as well as the tokenizer
- Created a Flask app and a front end using React, and hosted them locally
- Then we deployed the web app on Microsoft Azure using a VM
- Accuracy graph
- Loss graph
- Confusion matrix
- F1 score, Precision, and Recall
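
For reference, the confusion matrix, precision, recall, and F1 score listed above can be computed from binary predictions along these lines. This is a small sketch, assuming labels are encoded as 1 for positive and 0 for negative; the function names are illustrative and are not taken from the project's code.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = positive."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class; 0.0 when undefined."""
    tp, fp, fn, _ = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

Accuracy follows from the same counts as `(tp + tn) / (tp + fp + fn + tn)`.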