
Question about the bbox in VQA features #61

@1144181135

Hi ViLBERT team, I read your code for the VQA dataset in vilbert/datasets/vqa_dataset.py, and I see that the bbox has shape "N x 5". I have two questions:

  1. What does the dimension of 5 mean? A bbox is usually [x1, y1, x2, y2], which has dimension 4. What does the last dimension represent? Is it the classification score?
  2. Do the bbox coordinates [x1, y1, x2, y2] need to be normalized to the range 0~1?

That's all. Thank you.
