vqav2

Here are 7 public repositories matching this topic...

PyTorch implementation of visual question answering (VQA) using Stacked Attention Networks: a multimodal architecture that takes an image and a question as input, encodes them with a CNN and an LSTM, and applies stacked attention layers for improved accuracy (54.82%). Includes visualization of the attention layers. Uses the VQA v2.0 dataset. Contributions welcome.

  • Updated Jan 18, 2023
  • Jupyter Notebook
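
The stacked attention idea in the description above can be illustrated with a minimal, dependency-free sketch. This is not the repository's code: it assumes the common formulation in which each hop scores image regions against the current query vector, takes a softmax-weighted sum of the regions, and adds it back to the query. All names (`attention_hop`, `stacked_attention`, `random_matrix`), the tiny dimensions, and the random weights are hypothetical, and bias terms are omitted for brevity.

```python
import math
import random


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def attention_hop(regions, query, w_img, w_qry, w_score):
    """One hop: weight image regions by relevance to the query, then refine the query."""
    q_proj = matvec(w_qry, query)
    scores = []
    for region in regions:
        r_proj = matvec(w_img, region)
        hidden = [math.tanh(r + q) for r, q in zip(r_proj, q_proj)]
        scores.append(sum(w * h for w, h in zip(w_score, hidden)))
    weights = softmax(scores)  # one attention weight per image region
    attended = [
        sum(p * region[d] for p, region in zip(weights, regions))
        for d in range(len(query))
    ]
    refined = [a + q for a, q in zip(attended, query)]
    return refined, weights


def stacked_attention(regions, query, hops):
    """Apply several hops in sequence, keeping each attention map for visualization."""
    attention_maps = []
    for w_img, w_qry, w_score in hops:
        query, weights = attention_hop(regions, query, w_img, w_qry, w_score)
        attention_maps.append(weights)
    return query, attention_maps


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy stand-ins (assumptions): 6 image regions from a CNN, 1 question vector from an LSTM.
rng = random.Random(0)
dim, hidden_dim, num_regions = 4, 3, 6
regions = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_regions)]
question = [rng.uniform(-1, 1) for _ in range(dim)]
hops = [
    (
        random_matrix(hidden_dim, dim, rng),
        random_matrix(hidden_dim, dim, rng),
        [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)],
    )
    for _ in range(2)
]

final_query, attention_maps = stacked_attention(regions, question, hops)
```

In a real model the region and question vectors would come from trained CNN and LSTM encoders, and `final_query` would feed an answer classifier; the per-hop `attention_maps` are the quantities one would typically reshape onto the image grid to visualize attention.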
