Yian Li1,
Wentao Tian1,
Yang Jiao1,
Jingjing Chen1†,
Tianwen Qian2,
Bin Zhu3,
Na Zhao4,
Yu-Gang Jiang1
1Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University
2East China Normal University
3Singapore Management University
4Singapore University of Technology and Design
† Corresponding Author
- [2025-06-02] The Qwen-AD paper is released on arXiv and the GitHub repo is created.
Recently, Multimodal Large Language Models (MLLMs) have achieved significant success across multiple disciplines due to their exceptional instruction-following capabilities and extensive world knowledge. However, whether these MLLMs possess human-like compositional reasoning abilities remains an open problem. To unveil their reasoning behaviors, we first curate the Multimodal Assumptive Reasoning Benchmark (MARS-Bench). Interestingly, we find that most prevalent MLLMs can be easily fooled by the introduction of a presupposition into the question, whereas such presuppositions appear naive to human reasoning. We also propose Active Deduction (AD), a simple yet effective reinforcement learning paradigm that encourages the model to actively perform composite deduction before reaching a final decision. Equipped with the proposed AD method, an MLLM demonstrates significant improvements in assumptive reasoning abilities without compromising its general-purpose question-answering performance. We also provide extensive evaluations of both open-source and private MLLMs on MARS-Bench, along with experimental analyses of the AD method.
Coming soon.
If you find this project useful in your research, please consider citing:
@article{li2024look,
title={Look before you decide: Prompting active deduction of mllms for assumptive reasoning},
author={Li, Yian and Tian, Wentao and Jiao, Yang and Chen, Jingjing and Qian, Tianwen and Zhu, Bin and Zhao, Na and Jiang, Yu-Gang},
journal={arXiv preprint arXiv:2404.12966},
year={2024}
}
We are immensely grateful to the ms-swift, Qwen2.5-VL, and VLMEvalKit projects for the inception of this repository. Thanks for their great work!