A deep learning project for sentiment analysis of multilingual text (Tamil and Tulu) using transformers and graph-based approaches.
- 🔄 Fine-tuning XLM-RoBERTa for multilingual sentiment classification.
- 🌐 Graph Neural Networks (GNN) for sentiment analysis.
- 🎭 Support for multiple sentiment classes: Positive, Negative, Neutral, Mixed Feelings, etc.
- ⏱️ Training with Early Stopping and Model Checkpointing.
- 📊 F1-score-based evaluation for a robust measure of performance.
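The early stopping and checkpointing listed above can be sketched as follows. This is a minimal illustration, not the training loop from main.py: the `EarlyStopping` class, its `patience` and `min_delta` parameters, and the validation F1 values are all hypothetical.

```python
class EarlyStopping:
    """Track a validation metric (higher is better) and signal when to stop.

    Hypothetical helper for illustration; not taken from main.py.
    """

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best_score = None
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, score):
        """Record one epoch; return True if it is the new best epoch,
        meaning the caller should save a checkpoint."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_epochs = 0
            return True
        self.bad_epochs += 1
        return False

    @property
    def should_stop(self):
        return self.bad_epochs >= self.patience


# Made-up validation F1 scores, one per epoch.
val_f1_per_epoch = [0.52, 0.58, 0.61, 0.60, 0.61, 0.59, 0.62]

stopper = EarlyStopping(patience=3)
checkpoints = []  # epochs at which a checkpoint would be written
for epoch, f1 in enumerate(val_f1_per_epoch):
    if stopper.step(epoch, f1):
        checkpoints.append(epoch)  # real code would save model weights here
    if stopper.should_stop:
        break
```

With a patience of 3, training in this toy run halts before the final epoch is reached, and the checkpoint from the best epoch is the one that would be kept for inference.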
- Create a new conda environment:
conda create -n sentiment python=3.9
conda activate sentiment
- Install basic dependencies:
conda install pytorch torchvision -c pytorch
pip install -r requirements.txt
- For graph-based approaches, install additional dependencies:
pip install -r graph_requirements.txt
Install all required packages:
pip install -r requirements.txt
pip install -r graph_requirements.txt # Optional: for graph-based approach
Ensure your dataset files are in the dataset/ folder:
cleaned_tamil_dev.csv
cleaned_tamil_train.csv
Tam-SA-train.csv
Tam-SA-val.csv
Tulu_SA_train.csv
Tulu_SA_val.csv
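A quick way to confirm the files are in place before training is a small check script. The sketch below is not part of the repository: the `missing_dataset_files` helper is hypothetical, and it assumes the script is run from the repository root so that dataset/ resolves correctly.

```python
from pathlib import Path

# File names as listed above.
EXPECTED_FILES = [
    "cleaned_tamil_dev.csv",
    "cleaned_tamil_train.csv",
    "Tam-SA-train.csv",
    "Tam-SA-val.csv",
    "Tulu_SA_train.csv",
    "Tulu_SA_val.csv",
]


def missing_dataset_files(dataset_dir="dataset"):
    """Return the expected file names that are not present in dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```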
python main.py # Transformer-based approach
python graph_approach.py # Graph-based (GNN) approach
python test_model.py # Model inference
💡 Use test_model.py to test the trained model on new text.
- Transformer Approach: Fine-tuned XLM-RoBERTa base model.
- Graph Approach: Graph Convolutional Networks (GCN) with custom tokenization.
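For readers new to GCNs, a single graph-convolution layer computes ReLU(Â H W), where Â is the adjacency matrix with self-loops added and symmetrically normalized, H holds the node features, and W is a learned weight matrix. The sketch below shows that one step in plain Python on a toy graph. It is an illustration under assumptions, not the code in graph_approach.py: the graph, features, and weights are made up, and the project's graph construction and custom tokenization are not shown here.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One GCN layer: ReLU(A_norm @ features @ weights)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Toy graph: 3 token nodes connected in a chain (0 - 1 - 2).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up 2-d node features
weights = [[1.0, -1.0], [0.5, 1.0]]              # made-up 2x2 weight matrix

out = gcn_layer(adj, features, weights)  # one row of hidden features per node
```

In practice a GCN stacks a few such layers and pools the node outputs into a single vector per text before the classification head.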
- 🤖 Transformer Model: ~0.63 F1 score.
- 🌐 Graph Model: Comparable performance with unique strengths.
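The results above do not say which F1 average was used. As one common choice for multi-class sentiment data, macro-averaged F1 can be computed as in the sketch below. The `macro_f1` function and the example labels are hypothetical and not taken from the project's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Made-up labels for illustration only.
y_true = ["Positive", "Negative", "Neutral", "Positive", "Mixed Feelings", "Negative"]
y_pred = ["Positive", "Neutral", "Neutral", "Positive", "Positive", "Negative"]

score = macro_f1(y_true, y_pred)
```

Macro averaging gives every class equal weight, so a rarely predicted class such as Mixed Feelings pulls the score down as much as a frequent one would.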
.
├── dataset/ # Data files
├── main.py # Transformer-based implementation
├── graph_approach.py # GNN-based implementation
├── test_model.py # Model inference code
├── FInetuning_ XLM-RoBERTa.ipynb # Jupyter notebook
├── requirements.txt # Basic dependencies
└── graph_requirements.txt # Additional GNN dependencies
- NVIDIA GeForce RTX 4060 GPU with 8GB GPU RAM.
- CUDA 12.1 and PyTorch 2.5.1.
- 🖥️ Tested on Google Colab T4 GPU (15GB GPU RAM).