[Paper] [Code] [Project Page]
Vision-Language Models (VLMs) are becoming increasingly popular across various visual tasks, and many open-sourced VLM variants have been released. However, selecting the best-performing VLM for a specific downstream task is challenging: no single VLM performs well on every downstream task, and evaluating all available VLMs is impractical due to time and data limitations. To address this problem, we propose a novel paradigm for selecting and reusing VLMs for downstream adaptation, called Model Label Learning (MLL).
MLL contains three key modules: model labeling, which assigns labels to each VLM to describe its specialty and utility; model selection, which matches the requirements of the target task against the model labels; and model reuse, which applies the selected VLMs to the target task in an ensemble manner. The paradigm is highly computationally efficient and growable. We also introduce a new benchmark for evaluating VLM selection methods, covering 49 VLMs and 17 target task datasets.
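To make the three modules concrete, here is a minimal, self-contained sketch of the MLL pipeline. The function names, the `evaluate` callback, and the `predict_proba` interface are illustrative assumptions rather than the repository's actual API; the real implementation lives in `./MLL/label.py`, `selection.py`, and `reuse.py`.

```python
# Illustrative sketch of the MLL pipeline (hypothetical names, not the repo API).
from collections import defaultdict
import numpy as np

def label_models(vlms, proxy_datasets, evaluate):
    """Model labeling: record each VLM's accuracy on every proxy dataset."""
    labels = defaultdict(dict)
    for model_name, vlm in vlms.items():
        for ds_name, dataset in proxy_datasets.items():
            labels[model_name][ds_name] = evaluate(vlm, dataset)
    return labels

def select_models(labels, task_similarity, k=3):
    """Model selection: score each VLM by its labels weighted by how similar
    each proxy dataset is to the target task, then keep the top-k models."""
    scores = {
        model_name: sum(acc * task_similarity[ds] for ds, acc in per_ds.items())
        for model_name, per_ds in labels.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def reuse_models(selected_vlms, image, class_names):
    """Model reuse: ensemble the selected VLMs by averaging class probabilities."""
    probs = [vlm.predict_proba(image, class_names) for vlm in selected_vlms]
    return int(np.argmax(np.mean(probs, axis=0)))
```

In this sketch a model's label is simply its accuracy profile over proxy datasets, selection weights that profile by proxy-to-target similarity, and reuse averages the selected models' predictions.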
- [2025-05] Our paper has been accepted by ICML 2025!
- [2025-05] The code for MLL is now released.
- [2025-01] Our paper is now available on arXiv.
Prepare your conda environment with environment.yml.
$ conda env create -f environment.yml
All required datasets are listed in the JSON file ./MLL/dataset.json. You can download them from the source website of each dataset and place them at the paths specified by the 'root' attribute of each entry. Here are the download links for each dataset.
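Before running anything, it can help to verify that every dataset sits where dataset.json expects it. The snippet below assumes dataset.json maps each dataset name to an entry containing a 'root' path; adjust the loop if the actual layout differs.

```python
# Sanity-check the dataset roots listed in ./MLL/dataset.json.
# Assumes a {dataset_name: {"root": path, ...}} layout; adapt if yours differs.
import json
import os

with open("./MLL/dataset.json") as f:
    datasets = json.load(f)

for name, info in datasets.items():
    root = info["root"]
    status = "found" if os.path.isdir(root) else "MISSING"
    print(f"{name:<20s} {root} [{status}]")
```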
You can download the evaluation datasets evaluation_datasets.tar.gz from here and unzip the archive to ./evaluation_dataset. You can also construct it from scratch by running the code:
$ python ./MLL/evaluation_dataset_construct.py
$ python ./MLL/label.py
$ python selection.py
$ python reuse.py
The results will be stored in ./MLL/res/reuse.
If you find our work useful, please consider citing:
@article{tan2025vision,
  title={Vision-Language Model Selection and Reuse for Downstream Adaptation},
  author={Tan, Hao-Zhe and Zhou, Zhi and Li, Yu-Feng and Guo, Lan-Zhe},
  journal={arXiv preprint arXiv:2501.18271},
  year={2025}
}