This project presents a resource-efficient approach to sentiment analysis for social media platforms by leveraging textual content, emojis, and hashtags.
The objective is to build a computationally efficient yet accurate multi-sentiment classification system using a traditional machine learning algorithm — Random Forest Classifier.
Unlike deep learning models that require heavy computational resources, this study demonstrates that optimized machine learning techniques combined with intelligent feature engineering can achieve strong performance while remaining lightweight and scalable.
- Perform multi-class sentiment classification on social media data
- Incorporate emoji-based sentiment understanding
- Utilize hashtags as contextual sentiment indicators
- Develop a resource-efficient ML pipeline
- Evaluate model performance using multiple evaluation metrics
The model classifies input text into the following sentiment classes:
- Positive
- Negative
- Neutral
- Angry
- Joyful
- Sad
- Fearful
- Surprised
- Love
- Sarcastic
- Text normalization
- Removal of noise and special characters
- Stopword removal
- Tokenization and lemmatization
- Emoji extraction and sentiment mapping
- Hashtag identification and processing
- TF-IDF Vectorization
- Emoji sentiment scoring
- Hashtag frequency encoding
- Feature combination for improved representation
- Algorithm: Random Forest Classifier
- Training and validation split
- Hyperparameter optimization
- Efficient feature selection
Performance evaluated using:
- Accuracy
- Precision
- Recall
- F1-Score
- Confusion Matrix
- ROC-AUC Curve
- Python
- Scikit-learn
- Pandas
- NumPy
- Matplotlib
- Seaborn
- NLTK / SpaCy
- Emoji processing libraries
- Achieved high multi-class classification performance
- Emoji and hashtag integration significantly improved sentiment prediction
- Random Forest provided strong accuracy with reduced computational cost
- Demonstrated effectiveness of traditional ML models for NLP tasks
- Multi-sentiment classification beyond binary sentiment analysis
- Integration of emoji intelligence into NLP workflow
- Hashtag-aware sentiment modeling
- Resource-efficient alternative to deep learning models
- Practical implementation suitable for real-world deployment
Resource-Efficient-Sentiment-Analysis │ ├── dataset/ # Input datasets (sample data) │ ├── notebooks/ # Jupyter notebooks for experimentation │ └── sentiment_analysis.ipynb │ ├── src/ # Source code files │ ├── preprocessing.py │ ├── feature_engineering.py │ ├── model_training.py │ └── evaluation.py │ ├── model/ # Saved trained models │ └── random_forest_model.pkl │ ├── outputs/ # Generated plots and results │ ├── confusion_matrix.png │ ├── accuracy_plot.png │ └── roc_curve.png │ ├── requirements.txt # Project dependencies ├── README.md # Project documentation └── main.py # Main execution script
git clone https://github.com/yourusername/resource-efficient-sentiment-analysis.git
cd resource-efficient-sentiment-analysis
pip install -r requirements.txt
python main.py
- Real-time sentiment monitoring dashboard
- Deployment using Flask or FastAPI
- Integration with Twitter/X or Instagram APIs
- Deep learning comparison (LSTM, BERT)
- Multilingual sentiment analysis support
Rudrani Mukherjee
B.Tech – Computer Science & Engineering (Data Science)
Machine Learning | NLP | Data Analytics Enthusiast
This project is developed for academic and research purposes.
This work explores efficient machine learning solutions for sentiment understanding in modern social media communication environments.