Repositories list
15 repositories
twigvlm (Public): Implementation of ICCV 2025 paper "Growing a Twig to Accelerate Large Vision-Language Models".
prophet (Public): Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
imp (Public)
mlc-imp (Public)
anetqa (Public template)
anetqa-code (Public)
rosita (Public): ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
bst (Public)
xmchat (Public)
- A PyTorch reimplementation of bottom-up-attention models
openvqa (Public): A lightweight, scalable, and general framework for visual question answering research
mcan-vqa (Public): Deep Modular Co-Attention Networks for Visual Question Answering
activitynet-qa (Public)
mmnas (Public)
mt-captioning (Public): A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning