Referring Image Matting [CVPR-2023]

This is the official repository of the paper Referring Image Matting.

Jizhizi Li, Jing Zhang, and Dacheng Tao

Introduction | RefMatte | CLIPMat | Results | Statement

🚀 News

[2023-04-17]: The datasets RefMatte and RefMatte-RW100 are now openly accessible from the links below! Please follow the dataset release agreements to access them.

Dataset	Dataset Link (One Drive)	Size	Dataset Release Agreement
RefMatte	Link (pw: 3ft9cb)	43.7G	Agreement (CC BY-NC License)
RefMatte-RW100	Link (pw: 3ft9cb)	66.6M	Agreement (CC BY-NC License)

[2023-02-28]: The paper has been accepted by the Computer Vision and Pattern Recognition Conference (CVPR)! 🎉

Introduction

Image matting refers to extracting accurate foregrounds from an image. Current automatic methods tend to extract all the salient objects in the image indiscriminately. In this paper, we propose a new task named Referring Image Matting (RIM), which refers to extracting the meticulous alpha matte of the specific object that best matches a given natural language description. We then propose a large-scale dataset, RefMatte, and a carefully designed method, CLIPMat, to serve as a baseline suite for RIM. We believe the new RIM task, along with the RefMatte dataset and the CLIPMat method, will open new research directions in this area and facilitate future studies. The dataset, code, and method will be published soon.
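For readers new to matting, an alpha matte can be read through the standard compositing model, in which each observed pixel is a blend I = alpha * F + (1 - alpha) * B of a foreground colour F and a background colour B. The sketch below is a minimal, pure-Python illustration of that model for a single pixel. It is not code from this repository: the function name `composite_pixel` and the sample colour values are hypothetical.

```python
from typing import Sequence, Tuple


def composite_pixel(
    fg: Sequence[float], bg: Sequence[float], alpha: float
) -> Tuple[float, ...]:
    """Blend one foreground pixel over one background pixel.

    Applies I = alpha * F + (1 - alpha) * B to each channel.
    `alpha` is the matte value in [0, 1]; 1 means fully foreground.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(fg) != len(bg):
        raise ValueError("fg and bg must have the same number of channels")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


# Hypothetical RGB values, each channel in [0, 1].
foreground = (1.0, 0.0, 0.0)  # pure red
background = (0.0, 0.0, 1.0)  # pure blue

opaque = composite_pixel(foreground, background, 1.0)  # foreground only
half = composite_pixel(foreground, background, 0.5)    # even blend
clear = composite_pixel(foreground, background, 0.0)   # background only
```

Under this reading, a RIM model is asked to predict the alpha value for every pixel of the one object that the text describes, rather than for every salient object in the image.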

RefMatte and RefMatte-RW100

Prevalent visual grounding methods are all limited to the segmentation level, probably due to the lack of high-quality datasets. To fill the gap, we establish RefMatte, the first large-scale challenging dataset for RIM. We design a comprehensive image composition and expression generation engine that produces synthetic images on top of current public high-quality matting foregrounds, with flexible logic and diverse re-labelled attributes. RefMatte consists of 230 object categories, 47,500 images, 118,749 expression-region entities, and 474,996 expressions, and can easily be extended in the future. In addition, we construct a real-world test set, RefMatte-RW100, which consists of 100 natural images with manually generated phrase annotations, to further evaluate the generalization of RIM models. We show some examples of RefMatte as follows, including the images, the alpha mattes, and the input texts. More can be seen from this page. We have released the datasets RefMatte and RefMatte-RW100; please follow the dataset release agreements to access them.

Dataset	Dataset Link (One Drive)	Size	Dataset Release Agreement
RefMatte	Link (pw: 3ft9cb)	43.7G	Agreement (CC BY-NC License)
RefMatte-RW100	Link (pw: 3ft9cb)	66.6M	Agreement (CC BY-NC License)
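The headline counts quoted for RefMatte also imply some simple averages. The short calculation below uses only the numbers stated in this README; the variable names are illustrative, and the averages say nothing about how entities or expressions are actually distributed across individual images.

```python
# Dataset statistics as quoted in this README.
NUM_IMAGES = 47_500
NUM_ENTITIES = 118_749       # expression-region entities
NUM_EXPRESSIONS = 474_996

# Averages derived from the headline counts only.
expressions_per_entity = NUM_EXPRESSIONS / NUM_ENTITIES
entities_per_image = NUM_ENTITIES / NUM_IMAGES
expressions_per_image = NUM_EXPRESSIONS / NUM_IMAGES

print(f"expressions per entity: {expressions_per_entity:.2f}")  # 4.00
print(f"entities per image:     {entities_per_image:.2f}")      # 2.50
print(f"expressions per image:  {expressions_per_image:.2f}")   # 10.00
```

The expression count is exactly four times the entity count, and each image carries about 2.5 entities and about 10 expressions on average.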

We also generate word clouds of the keywords, attributes, and relationships in RefMatte, shown below. As can be seen, the dataset contains a large portion of humans and animals, since they are very common in the image matting task. The most frequent attributes in RefMatte are male, gray, transparent, and salient, while the relationship words are more balanced.

CLIPMat

Furthermore, we present a novel baseline method for RIM, CLIPMat, which includes a context-embedded prompt, a text-driven semantic pop-up, and a multi-level details extractor. Extensive experiments on RefMatte in both keyword and expression settings validate the superiority of CLIPMat over representative methods. The diagram is shown below; more information can be found in the paper.

Results

We show some examples of CLIPMat's results on the RefMatte test set and RefMatte-RW100, given the images and text inputs, under both the keyword-based and expression-based settings. More can be seen from this page.

Statement

If you are interested in our work, please consider citing the following:

@inproceedings{rim,
  title={Referring Image Matting},
  author={Li, Jizhizi and Zhang, Jing and Tao, Dacheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}

This project is under the CC BY-NC license. For further questions, please contact Jizhizi Li at [email protected].

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-Preserving Portrait Matting, ACM MM, 2021 | Paper | Github
Jizhizi Li*, Sihan Ma*, Jing Zhang, and Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
Jizhizi Li*, Jing Zhang*, Stephen J. Maybank, and Dacheng Tao

[4] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
Sihan Ma*, Jizhizi Li*, Jing Zhang, He Zhang, and Dacheng Tao

[5] Deep Image Matting: A Comprehensive Survey, ArXiv, 2023 | Paper | Github
Jizhizi Li, Jing Zhang, and Dacheng Tao
