Open
Description
Hi VilBert team, I read your VQA dataset code in vilbert/datasets/vqa_dataset.py and I see that the bbox has shape "N x 5". I have some questions:
- What is the meaning of the dimension 5? Bbox information would normally be [x1, y1, x2, y2], which has dimension 4. What does the last dimension mean? Is it a classification score?
- Do the bbox coordinates [x1, y1, x2, y2] need to be normalized to 0~1?
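To make the second question concrete, here is a minimal sketch of what I mean by normalizing to 0~1, assuming the boxes are given in pixel coordinates. The function `normalize_boxes`, the image size, and the sample box are hypothetical and not taken from the repository code, and the sketch says nothing about what the fifth value holds.

```python
def normalize_boxes(boxes, image_width, image_height):
    """Scale [x1, y1, x2, y2] pixel boxes into the 0~1 range.

    Assumption: x values are divided by the image width and
    y values by the image height.
    """
    normalized = []
    for x1, y1, x2, y2 in boxes:
        normalized.append([
            x1 / image_width,
            y1 / image_height,
            x2 / image_width,
            y2 / image_height,
        ])
    return normalized


# Hypothetical example: one box in a 640 x 480 image.
boxes = [[64.0, 48.0, 320.0, 240.0]]
print(normalize_boxes(boxes, 640, 480))
```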
That's all. Thank you.