
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention

Paper

Official Implementation of
"Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention"
🏆 Recognized as "Best Paper Candidate" at ECCV 2024 (Milan, Italy)


(Figures: OvSGG and OvSGTR)


📰 News

  • 2025.05: Released the MegaSG dataset introduced in Scene-Bench
  • 2025.02: Added checkpoints for the TPAMI version
  • 2024.10: Our paper was recognized as a "Best Paper Candidate" at ECCV 2024 (Milan, Italy)

🛠️ Setup

For simplicity, you can directly run:

bash install.sh

which includes the following steps:

  1. Install PyTorch 1.9.1 and other dependencies:
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

(Adjust CUDA version if necessary.)

  2. Install GroundingDINO and download pretrained weights:
(cd GroundingDINO && python3 setup.py install)
mkdir -p $PWD/GroundingDINO/weights/
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth -O $PWD/GroundingDINO/weights/groundingdino_swint_ogc.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth -O $PWD/GroundingDINO/weights/groundingdino_swinb_cogcoor.pth
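The `+cu111` suffix in the pip command above pins the CUDA 11.1 build of PyTorch; if your machine runs a different CUDA release, swap the suffix for the matching wheel tag. A minimal sketch of the mapping, assuming the standard PyTorch wheel naming scheme (the helper names here are ours, not part of this repo):

```python
def wheel_tag(cuda_version: str) -> str:
    """Map a CUDA release like '11.1' to the PyTorch wheel suffix 'cu111'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"

def torch_spec(cuda_version: str) -> str:
    """Build the pip requirement string used in the setup step above."""
    return f"torch==1.9.1+{wheel_tag(cuda_version)}"

print(torch_spec("11.1"))  # torch==1.9.1+cu111
```

Note that only CUDA builds actually published for torch 1.9.1 (e.g. cu102, cu111) will resolve on the wheel index.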

📚 Dataset

Supported datasets:

  • VG150
  • COCO

Prepare the datasets under the data/ folder following the instructions.


📈 Closed-set SGG

Training

bash scripts/DINO_train_dist.sh vg ./config/GroundingDINO_SwinT_OGC_full.py ./data ./logs/ovsgtr_vg_swint_full ./GroundingDINO/weights/groundingdino_swint_ogc.pth

or using Swin-B:

bash scripts/DINO_train_dist.sh vg ./config/GroundingDINO_SwinB_full.py ./data ./logs/ovsgtr_vg_swinb_full ./GroundingDINO/weights/groundingdino_swinb_cogcoor.pth

Adjust CUDA_VISIBLE_DEVICES if needed. Effective batch size = batch size × number of GPUs.
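A quick illustration of that effective-batch-size rule, counting GPUs from a `CUDA_VISIBLE_DEVICES`-style string (the values are hypothetical and the helper is ours):

```python
def effective_batch_size(per_gpu_batch: int, cuda_visible_devices: str) -> int:
    """Effective batch size = per-GPU batch size x number of visible GPUs."""
    n_gpus = len([d for d in cuda_visible_devices.split(",") if d.strip()])
    return per_gpu_batch * n_gpus

# e.g. CUDA_VISIBLE_DEVICES=0,1,2,3 with a per-GPU batch size of 4
print(effective_batch_size(4, "0,1,2,3"))  # 16
```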

Inference

bash scripts/DINO_eval.sh vg [config file] [data path] [output path] [checkpoint]

or

bash scripts/DINO_eval_dist.sh vg [config file] [data path] [output path] [checkpoint]
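To make the bracketed placeholders concrete, the invocation can be assembled programmatically. The sketch below fills in the Swin-T config and data paths from the training example; the checkpoint filename is a placeholder for your own trained weights, and the `eval_command` helper is ours:

```python
def eval_command(config: str, data: str, output: str, checkpoint: str,
                 distributed: bool = False) -> list:
    """Assemble the DINO_eval invocation, mirroring the placeholders above."""
    script = "scripts/DINO_eval_dist.sh" if distributed else "scripts/DINO_eval.sh"
    return ["bash", script, "vg", config, data, output, checkpoint]

cmd = eval_command("./config/GroundingDINO_SwinT_OGC_full.py", "./data",
                   "./logs/eval_vg_swint_full",
                   "./logs/ovsgtr_vg_swint_full/checkpoint.pth")  # hypothetical checkpoint path
print(" ".join(cmd))
```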

Benchmark on Closed-set SGG


📥 Checkpoints (Closed-set SGG)

| Backbone | R@20/50/100 | Checkpoint | Config |
| --- | --- | --- | --- |
| Swin-T | 26.97 / 35.82 / 41.38 | link | config/GroundingDINO_SwinT_OGC_full.py |
| Swin-T (pretrained on MegaSG) | 27.34 / 36.27 / 41.95 | link | config/GroundingDINO_SwinT_OGC_full.py |
| Swin-B | 27.75 / 36.44 / 42.35 | link | config/GroundingDINO_SwinB_full.py |
| Swin-B (w/o freq bias & focal loss) | 27.53 / 36.18 / 41.79 | link | config/GroundingDINO_SwinB_full_open.py |
| Swin-B (pretrained on MegaSG) | 28.61 / 37.58 / 43.41 | link | config/GroundingDINO_SwinB_full_open.py |

🚀 OvD-SGG (Open-vocabulary Detection SGG)

Enable OvD mode by setting the following flag in the config:

sg_ovd_mode = True

📥 Checkpoints (OvD-SGG)

| Backbone | R@20/50/100 (Base+Novel) | R@20/50/100 (Novel) | Checkpoint | Config |
| --- | --- | --- | --- | --- |
| Swin-T | 12.34 / 18.14 / 23.20 | 6.90 / 12.06 / 16.49 | link | config/GroundingDINO_SwinT_OGC_ovd.py |
| Swin-B | 15.43 / 21.35 / 26.22 | 10.21 / 15.58 / 19.96 | link | config/GroundingDINO_SwinB_ovd.py |
| Swin-T (pretrained on MegaSG) | 14.33 / 20.91 / 25.98 | 10.52 / 17.30 / 22.90 | link | config/GroundingDINO_SwinT_OGC_ovd.py |
| Swin-B (pretrained on MegaSG) | 15.21 / 21.21 / 26.12 | 10.31 / 15.78 / 20.47 | link | config/GroundingDINO_SwinB_ovd.py |


🔥 OvR-SGG (Open-vocabulary Relation SGG)

Enable OvR mode by setting the following flag in the config:

sg_ovr_mode = True

📥 Checkpoints (OvR-SGG)

| Backbone | R@20/50/100 (Base+Novel) | R@20/50/100 (Novel) | Checkpoint | Config | Pre-trained Checkpoint | Pre-trained Config |
| --- | --- | --- | --- | --- | --- | --- |
| Swin-T | 15.85 / 20.50 / 23.90 | 10.17 / 13.47 / 16.20 | link | config/GroundingDINO_SwinT_OGC_ovr.py | link | config/GroundingDINO_SwinT_OGC_pretrain.py |
| Swin-B | 17.63 / 22.90 / 26.68 | 12.09 / 16.37 / 19.73 | link | config/GroundingDINO_SwinB_ovr.py | link | config/GroundingDINO_SwinB_pretrain.py |
| Swin-T (pretrained on MegaSG) | 19.38 / 25.40 / 29.71 | 12.23 / 17.02 / 21.15 | link | config/GroundingDINO_SwinT_OGC_ovr.py | link | config/GroundingDINO_SwinT_OGC_pretrain.py |
| Swin-B (pretrained on MegaSG) | 21.09 / 27.92 / 32.74 | 16.59 / 22.86 / 27.73 | link | config/GroundingDINO_SwinB_ovr.py | link | config/GroundingDINO_SwinB_pretrain.py |

🌟 OvD+R-SGG (Joint Open-vocabulary SGG)

Enable both modes by setting the following flags in the config:

sg_ovd_mode = True
sg_ovr_mode = True
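Together, these two flags select one of the four settings covered in this README. A small sketch of that mapping (the helper function is ours; only the flag names come from the configs):

```python
def sgg_setting(sg_ovd_mode: bool = False, sg_ovr_mode: bool = False) -> str:
    """Map the two config flags to the SGG setting they select."""
    if sg_ovd_mode and sg_ovr_mode:
        return "OvD+R-SGG"       # joint open-vocabulary objects + relations
    if sg_ovd_mode:
        return "OvD-SGG"         # open-vocabulary object detection
    if sg_ovr_mode:
        return "OvR-SGG"         # open-vocabulary relations
    return "Closed-set SGG"      # both flags off (default)

print(sgg_setting(sg_ovd_mode=True, sg_ovr_mode=True))  # OvD+R-SGG
```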

📥 Checkpoints (OvD+R-SGG)

Backbone R@20/50/100 (Joint) R@20/50/100 (Novel Object) R@20/50/100 (Novel Relation) Checkpoint Config Pre-trained Checkpoint Pre-trained Config
Swin-T 10.02 / 13.50 / 16.37 10.56 / 14.32 / 17.48 7.09 / 9.19 / 11.18 link config/GroundingDINO_SwinT_OGC_ovdr.py link config/GroundingDINO_SwinT_OGC_pretrain.py
Swin-B 12.37 / 17.14 / 21.03 12.63 / 17.58 / 21.70 10.56 / 14.62 / 18.22 link config/GroundingDINO_SwinB_ovdr.py link config/GroundingDINO_SwinB_pretrain.py
Swin-T (pretrained on MegaSG) 10.67 / 15.15 / 18.82 8.22 / 12.49 / 16.29 9.62 / 13.68 / 17.19 link config/GroundingDINO_SwinT_OGC_ovdr.py link config/GroundingDINO_SwinT_OGC_pretrain.py
Swin-B (pretrained on MegaSG) 12.54 / 17.84 / 21.95 10.29 / 15.66 / 19.84 12.21 / 17.15 / 21.05 link config/GroundingDINO_SwinB_ovdr.py link config/GroundingDINO_SwinB_pretrain.py

🤝 Acknowledgement

We thank the authors of GroundingDINO and the other open-source projects this work builds on for their awesome codes and models.


📖 Citation

If you find our work helpful, please cite:

@inproceedings{chen2024expanding,
  title={Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention},
  author={Chen, Zuyao and Wu, Jinlin and Lei, Zhen and Zhang, Zhaoxiang and Chen, Changwen},
  booktitle={European Conference on Computer Vision (ECCV)},
  pages={108--124},
  year={2024}
}

✨ Enjoy Exploring Open-Vocabulary Scene Graph Generation!
