This work introduces a cross-modal learning method for training visual sentiment analysis models in the Twitter domain.
We used it to fine-tune Vision Transformer (ViT) models pre-trained on ImageNet-21k, which achieved strong results on manually annotated external benchmarks, surpassing the previous state of the art.
We crawled ∼3.7M pictures from social media from 1 April to 30 June and used them in our cross-modal approach. In particular, a cross-modal teacher-student learning technique (sketched below) avoids the need for human annotators, minimizing the required effort and allowing for the creation of vast training sets.
These training sets can help future research train robust visual models, as the number of parameters of current state-of-the-art models is growing exponentially, along with the amount of data they need to avoid overfitting.
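A minimal sketch of the teacher-student idea, under stated assumptions: a text sentiment classifier labels each tweet's text, and a ViT student learns to predict that label from the paired image. The teacher checkpoint shown here is a stand-in (not necessarily the one used in the paper), and `tweets` is a hypothetical iterable of (text, image path) pairs.

```python
import torch
from PIL import Image
from transformers import pipeline, ViTImageProcessor, ViTForImageClassification

# Teacher: a text sentiment model; this checkpoint is an assumption.
teacher = pipeline("sentiment-analysis",
                   model="cardiffnlp/twitter-roberta-base-sentiment")

# Student: ViT pre-trained on ImageNet-21k with a 3-class sentiment head.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch32-224-in21k")
student = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch32-224-in21k", num_labels=3)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

label_to_id = {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2}  # neg / neu / pos

for text, image_path in tweets:  # hypothetical (text, image path) pairs
    pseudo = teacher(text)[0]                     # teacher labels the *text*
    target = torch.tensor([label_to_id[pseudo["label"]]])
    inputs = processor(images=Image.open(image_path), return_tensors="pt")
    loss = student(**inputs, labels=target).loss  # student learns from the *image*
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```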
# Increase Git's HTTP buffer so the large repository clones reliably
$ git config --global http.postBuffer 1048576000
$ git clone --recursive https://github.com/fabiocarrara/cross-modal-visual-sentiment-analysis
$ chmod +x install_dependencies.sh
$ ./install_dependencies.sh
Evaluate a model on a benchmark (a minimal inference sketch follows the options below):
$ python3 scripts/test_benchmark.py -m <model_name> -b <benchmark_name>
Options for <model_name>: [boosted_model, ViT_L16, ViT_L32, ViT_B16, ViT_B32, merged_T4SA, bal_flat_T4SA2.0, bal_T4SA2.0, unb_T4SA2.0, B-T4SA_1.0_upd_filt, B-T4SA_1.0_upd, B-T4SA_1.0]
Options for <benchmark_name>: [5agree, 4agree, 3agree, FI_complete, emotion_ROI_test, twitter_testing_2]
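Once a checkpoint is available, running it on a single image looks roughly like the following; the checkpoint path and the class ordering are assumptions, not the repository's documented API.

```python
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = ViTForImageClassification.from_pretrained("path/to/boosted_model")  # hypothetical path
model.eval()

image = Image.open("tweet_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)

# Class ordering is an assumption; check the released checkpoint's config.
print(dict(zip(["negative", "neutral", "positive"], probs[0].tolist())))
```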
Run a five-fold cross-validation on a benchmark, report the mean accuracy and standard deviation, and save the predictions (uses the boosted_model by default); the protocol is sketched after the command:
$ python3 scripts/5_fold_cross.py -b <benchmark_name>
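The protocol amounts to something like the sketch below; `train_on` and `evaluate_on` are hypothetical helpers standing in for the repository's training and evaluation code, and the script's internals may differ.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def five_fold_accuracy(images, labels):
    """Mean and std of test accuracy over 5 stratified folds (numpy arrays in)."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    accuracies = []
    for train_idx, test_idx in skf.split(images, labels):
        model = train_on(images[train_idx], labels[train_idx])          # hypothetical
        accuracies.append(evaluate_on(model, images[test_idx], labels[test_idx]))
    return np.mean(accuracies), np.std(accuracies)
```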
Fine-tune on the five FI splits, report the mean accuracy and standard deviation, and save the predictions (uses the boosted_model by default); a sketch of one split follows the command:
$ python3 scripts/fine_tune_FI.py
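For one split, this step amounts to a standard supervised loop over FI's human-labeled images, in contrast to the pseudo-labeled distillation above; the checkpoint path and `fi_train_loader` are hypothetical placeholders.

```python
import torch
from transformers import ViTForImageClassification

model = ViTForImageClassification.from_pretrained("path/to/boosted_model")  # hypothetical path
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()

for pixel_values, labels in fi_train_loader:  # hypothetical DataLoader over one FI split
    loss = model(pixel_values=pixel_values, labels=labels).loss  # human labels, not pseudo-labels
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```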
Confidence filter thresholds (Pos/Neu/Neg) applied to the teacher labels, and the resulting accuracy (%) on the Twitter Dataset (TD) at each annotator-agreement level:

| Model | Dataset | Pos | Neu | Neg | Student Arch | 5 agree | 4 agree | 3 agree |
|---|---|---|---|---|---|---|---|---|
| Model 3.1 | A | - | - | - | B/32 | 82.2 | 78.0 | 75.5 |
| Model 3.2 | A | 0.70 | 0.70 | 0.70 | B/32 | 84.7 | 79.7 | 76.6 |
| Model 3.3 | B | 0.70 | 0.70 | 0.70 | B/32 | 82.3 | 78.7 | 75.3 |
| Model 3.4 | B | 0.90 | 0.90 | 0.70 | B/32 | 84.4 | 80.3 | 77.1 |
| Model 3.5 | A+B | 0.90 | 0.90 | 0.70 | B/32 | 86.5 | 82.6 | 78.9 |
| Model 3.6 | A+B | 0.90 | 0.90 | 0.70 | L/32 | 85.0 | 82.4 | 79.4 |
| Model 3.7 | A+B | 0.90 | 0.90 | 0.70 | B/16 | 87.0 | 83.1 | 79.4 |
| Model 3.8 | A+B | 0.90 | 0.90 | 0.70 | L/16 | 87.8 | 84.8 | 81.9 |
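The Pos/Neu/Neg thresholds are applied per class when filtering the teacher's pseudo-labels: a sample is kept only if the teacher's confidence for its class exceeds that class's threshold. A sketch of the rule, using Model 3.5's thresholds as an example; the `samples` structure is an assumption.

```python
# Per-class confidence thresholds (here: the 0.90/0.90/0.70 setting).
THRESHOLDS = {"pos": 0.90, "neu": 0.90, "neg": 0.70}

def keep(sample):
    # sample = {"label": "pos", "confidence": 0.93}, for example (hypothetical schema)
    return sample["confidence"] >= THRESHOLDS[sample["label"]]

filtered = [s for s in samples if keep(s)]  # `samples`: teacher-labeled training set
```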
COMING SOON
@inproceedings{serra2023emotions,
author = {Serra, Alessio and Carrara, Fabio and Tesconi, Maurizio and Falchi, Fabrizio},
editor = {Kobi Gal and Ann Now{\'{e}} and Grzegorz J. Nalepa and Roy Fairstein and Roxana Radulescu},
title = {The Emotions of the Crowd: Learning Image Sentiment from Tweets via Cross-Modal Distillation},
booktitle = {{ECAI} 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Krak{\'{o}}w, Poland - Including 12th Conference on Prestigious Applications of Intelligent Systems ({PAIS} 2023)},
series = {Frontiers in Artificial Intelligence and Applications},
volume = {372},
pages = {2089--2096},
publisher = {{IOS} Press},
year = {2023},
url = {https://doi.org/10.3233/FAIA230503},
doi = {10.3233/FAIA230503},
}