Universal Adversarial Perturbations for Vision-Language Pre-trained Models

This is the PyTorch implementation of the paper "Universal Adversarial Perturbations for Vision-Language Pre-trained Models", published at SIGIR 2024.

Requirements

pytorch 1.10.2
transformers 4.8.1
timm 0.4.9
bert_score 0.3.11
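
The pinned versions above can be checked against the active environment before running anything. The following is a minimal sketch, not part of the repository: the helper name `check_versions` is hypothetical, and the PyPI distribution names `torch` and `bert-score` are assumptions about how the `pytorch` and `bert_score` entries are installed. It relies on `importlib.metadata`, so it assumes Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Requirements list. The distribution names
# "torch" and "bert-score" are assumptions, not taken from the repo.
PINNED = {
    "torch": "1.10.2",
    "transformers": "4.8.1",
    "timm": "0.4.9",
    "bert-score": "0.3.11",
}


def check_versions(pinned=PINNED, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Ignore local build tags such as "+cu113" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions().items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is present at the listed version. Other versions may still work, but they are not the ones listed here.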

Prepare datasets and models

Download the Flickr30k and MSCOCO datasets (the annotations are provided in ./data_annotation/) and put them into ./Dataset. Then set the dataset root path in the image_root field of ./configs/Retrieval_flickr.yaml.

Checkpoints of the fine-tuned VLP models are available from the CLIP, ALBEF, TCL, and BLIP projects. Download them and put them into ./checkpoint.
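
Since the scripts expect data and checkpoints in fixed locations, a quick layout check can catch a missing directory early. This is a minimal sketch under stated assumptions: the helper `missing_dirs` is hypothetical and not part of the repository, and it only checks the directory names mentioned in this README (./Dataset, ./checkpoint, ./configs), not their contents.

```python
from pathlib import Path

# Directory names taken from this README; the helper itself is illustrative.
EXPECTED_DIRS = ("Dataset", "checkpoint", "configs")


def missing_dirs(repo_root=".", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```

Run it from the repository root. It does not verify that image_root in the config files points at the downloaded data.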

Learn universal adversarial perturbations

Before running the main scripts, set the source/target model names and checkpoint paths, the dataset names and roots, the test file path, original_rank_index_path, and any related options in the corresponding files.

# Learn UAPs by taking CLIP as the victim
python Attack_CLIP.py

# Learn UAPs by taking ALBEF/TCL as the victim 
python Attack_ALBEFTCL.py

Evaluation

Image-Text Retrieval

# Eval CLIP models:
python Eval_Retrieval_CLIP.py

# Eval ALBEF models:
python Eval_Retrieval_ALBEF.py

# Eval TCL models:
python Eval_Retrieval_TCL.py

Visual Grounding

Download the RefCOCO+ dataset from the original website, and set 'image_root' in configs/Grounding.yaml accordingly.
# Eval:
python Eval_Grounding.py

Image Captioning

Download the MSCOCO dataset from the original website, and set 'image_root' in configs/caption_coco.yaml accordingly.
# Eval:
python Eval_ImgCap_BLIP.py

Citation

If you find this code useful for your research, please consider citing our paper.

@article{zhang2024universal,
  title={Universal Adversarial Perturbations for Vision-Language Pre-trained Models},
  author={Zhang, Peng-Fei and Huang, Zi and Bai, Guangdong},
  journal={arXiv preprint arXiv:2405.05524},
  year={2024}
}

Reference

Co-Attack, SGA, ALBEF, BLIP.
