CaMEL: Mean Teacher Learning for Image Captioning

This repository contains the reference code for the paper CaMEL: Mean Teacher Learning for Image Captioning.

Please cite with the following BibTeX:

@inproceedings{barraco2022camel,
  title={{CaMEL: Mean Teacher Learning for Image Captioning}},
  author={Barraco, Manuele and Stefanini, Matteo and Cornia, Marcella and Cascianelli, Silvia and Baraldi, Lorenzo and Cucchiara, Rita},
  booktitle={International Conference on Pattern Recognition},
  year={2022}
}

Environment setup

Clone the repository and create the camel_release conda environment using the environment.yml file:

conda env create -f environment.yml
conda activate camel_release

Note: Python 3.8 is required to run our code.

Data preparation

To run the code, annotations and images for the COCO dataset are needed. Please download the zip files containing the images (train2014.zip, val2014.zip), and the annotations (annotations.zip) and extract them. These paths will be set as arguments later.

Evaluation

To reproduce the results reported in our paper, download the pretrained model file camel_mesh.pth or camel_nomesh.pth and place it anywhere. Its path will be set as argument later.

Run python evaluation.py using the following arguments:

Argument	Possible values
`--batch_size`	Batch size (default: `25`)
`--workers`	Number of workers (default: `0`)
`--resume_last`	If used, the training will be resumed from the last checkpoint
`--resume_best`	If used, the training will be resumed from the best checkpoint
`--annotation_folder`	Path to folder with COCO annotations (required)
`--image_folder`	Path to folder with COCO images (required)
`--saved_model_path`	Path to model weights file (required)
`--clip_variant`	CLIP variant to be used as image encoder (default: `RN50x16`)
`--network`	Network to be used in the evaluation, `online` or `target` (default: `target`)
`--disable_mesh`	If used, the model does not employ the mesh connectivity
`--N_dec`	Number of decoder layers (default: `3`)
`--N_enc`	Number of encoder layers (default: `3`)
`--d_model`	Dimensionality of the model (default: `512`)
`--d_ff`	Dimensionality of Feed-Forward layers (default: `2048`)
`--m`	Number of memory vectors (default: `40`)
`--head`	Number of heads (default: `8`)

For example, to evaluate our model, use

python evaluate.py --image_folder /path/to/images --annotation_folder /path/to/annotations --saved_model_path /path/to/model_file.pth

Training procedure

Run python train.py using the following arguments:

Argument	Possible values
`--exp_name`	Experiment name (default: `camel`)
`--batch_size`	Batch size (default: `25`)
`--workers`	Number of workers (default: `0`)
`--resume_last`	If used, the training will be resumed from the last checkpoint
`--resume_best`	If used, the training will be resumed from the best checkpoint
`--annotation_folder`	Path to folder with COCO annotations (required)
`--image_folder`	Path to folder with COCO images (required)
`--clip_variant`	CLIP variant to be used as image encoder (default: `RN50x16`)
`--distillation_weight`	Weight for the knowledge distillation loss (default: `0.1` in XE phase, `0.005` in SCST phase)
`--ema_weight`	Target decay rate of Mean Teacher paradigm (default: `0.999`)
`--phase`	Training phase, `xe` or `scst` (default: `xe`)
`--disable_mesh`	If used, the model does not employ the mesh connectivity
`--saved_model_file`	If used, path to model weights to be loaded
`--N_dec`	Number of decoder layers (default: `3`)
`--N_enc`	Number of encoder layers (default: `3`)
`--d_model`	Dimensionality of the model (default: `512`)
`--d_ff`	Dimensionality of Feed-Forward layers (default: `2048`)
`--m`	Number of memory vectors (default: `40`)
`--head`	Number of heads (default: `8`)
`--warmup`	Warmup value for learning rate scheduling (default: `10000`)

For example, to train our model with the parameters used in our experiments, use

python train.py --image_folder /path/to/images --annotation_folder /path/to/annotations

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
data		data
evaluation		evaluation
images		images
models		models
utils		utils
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
evaluation.py		evaluation.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

evaluation

evaluation

images

images

models

models

utils

utils

LICENSE

LICENSE

README.md

README.md

environment.yml

environment.yml

evaluation.py

evaluation.py

train.py

train.py

Repository files navigation

CaMEL: Mean Teacher Learning for Image Captioning

Environment setup

Data preparation

Evaluation

Training procedure

About

Releases

Packages

Contributors 2

Languages

License

aimagelab/camel

Folders and files

Latest commit

History

Repository files navigation

CaMEL: Mean Teacher Learning for Image Captioning

Environment setup

Data preparation

Evaluation

Training procedure

About

Topics

Resources

License

Stars

Watchers

Forks

Languages