This project focuses on extracting detailed, actionable insights from social media discussions about three major telecom operators (among them MTN). The analysis is based on data from two primary social media platforms: Twitter and Facebook.
We employ both traditional machine learning methods and state-of-the-art deep learning techniques, including BERT, to automatically identify and extract key descriptors from user opinions. These descriptors are then used to generate structured summaries of the sentiments expressed: telecom companies can use them to identify customer pain points and benchmark performance against competitors, while customers can use them to make informed choices about their providers.
For supervised learning tasks, we developed a custom human-annotated dataset, referred to as TelecomSent, containing 5,423 social media posts. Each post references one or more telecom providers, offering a rich dataset for sentiment analysis.
The core components extracted from these posts include the target telecom, the specific service aspect mentioned, and the sentiment expressed towards that aspect. This methodology falls under Targeted Aspect-Based Sentiment Analysis (TABSA).
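To make the TABSA framing concrete, each annotated post can be viewed as one or more (target, aspect, sentiment) triples. The sketch below uses an illustrative record layout; the field names and aspect labels are hypothetical, not the actual TelecomSent schema:

```python
from dataclasses import dataclass

@dataclass
class TabsaAnnotation:
    """One (target, aspect, sentiment) triple extracted from a post."""
    target: str     # telecom operator mentioned, e.g. "MTN"
    aspect: str     # service aspect, e.g. "network coverage"
    sentiment: str  # "positive", "negative", or "neutral"

# A single post can yield several triples, one per operator/aspect pair.
post = "MTN data bundles are overpriced but their coverage upcountry is great."
annotations = [
    TabsaAnnotation(target="MTN", aspect="pricing", sentiment="negative"),
    TabsaAnnotation(target="MTN", aspect="network coverage", sentiment="positive"),
]
```

Models are then trained to recover these triples from the raw post text.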
- Python 3.6+
- TensorFlow
- Access to a GPU (or use Google Colab)
- Scikit-learn
- BERT-Base (Google's pre-trained models)
- NLTK (Natural Language Toolkit)
- NumPy 1.15.4
- PyTorch 1.0.0
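Before running the notebooks, it can help to confirm that the required packages are importable. This is a generic sanity-check sketch (the module names mirror the requirements above; adjust them to your environment):

```python
from importlib.util import find_spec

def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return find_spec(module_name) is not None

# Import names corresponding to the requirements listed above.
required = ["tensorflow", "sklearn", "nltk", "numpy", "torch"]
missing = [name for name in required if not is_installed(name)]
if missing:
    print("Missing packages:", ", ".join(missing))
```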
The table below summarizes the results achieved with the various machine learning and deep learning approaches. We evaluated the models using strict accuracy, Macro-F1 score, and AUC, with results reported for both aspect category detection and sentiment classification (a dash indicates a metric that was not computed for that model).
| Model | Aspect Accuracy | Aspect F1 | Aspect AUC | Sentiment Accuracy | Sentiment AUC |
|---|---|---|---|---|---|
| RF-TFIDF | 0.540 | 0.392 | 0.615 | 0.958 | 0.737 |
| RF-word2vec | 0.391 | 0.115 | 0.538 | 0.956 | 0.533 |
| LR-TFIDF | 0.390 | 0.414 | 0.532 | 0.877 | 0.508 |
| LR-word2vec | 0.365 | 0.229 | 0.482 | 0.918 | 0.487 |
| LSTM | 0.705 | 0.231 | - | 0.705 | - |
| BERT | 0.748 | 0.791 | 0.963 | 0.937 | 0.961 |
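For reference, Macro-F1 averages the per-class F1 scores with equal weight, which matters here because the sentiment classes are imbalanced. A minimal pure-Python sketch of the metric (scikit-learn's `f1_score(average="macro")` computes the same quantity):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = set(y_true) | set(y_pred)
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    # Each class contributes equally, regardless of its frequency.
    return sum(f1_scores) / len(f1_scores)

y_true = ["pos", "neg", "neg", "neu", "pos"]
y_pred = ["pos", "neg", "pos", "neu", "pos"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally, a model that ignores a rare sentiment class is penalized even if its overall accuracy stays high.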
You can run the models via the respective Jupyter notebooks in the `Scripts` directory.
- Random Forest with TFIDF: Run Notebook
- Random Forest with Word2Vec: Run Notebook
- Logistic Regression with TFIDF: Run Notebook
- Logistic Regression with Word2Vec: Run Notebook
- BERT Implementation: Run Notebook
- LSTM Model: Explore Code