Commit 42303c3

update readme
1 parent 5819600 commit 42303c3

20 files changed (+61, -524 lines)

README.md

Lines changed: 61 additions & 2 deletions
@@ -1,2 +1,61 @@
1-
# FQGAN
2-
This is the project page of FQGAN
1+
<div align="center">
2+
<br>
3+
<h3>Factorized Visual Tokenization and Generation</h3>
4+
5+
[Zechen Bai](https://www.baizechen.site/) <sup>1</sup>&nbsp;
6+
[Jianxiong Gao](https://jianxgao.github.io/) <sup>2</sup>&nbsp;
7+
[Ziteng Gao](https://sebgao.github.io/) <sup>1</sup>&nbsp;
8+
[Pichao Wang](https://wangpichao.github.io/) <sup>3</sup>&nbsp;
9+
[Zheng Zhang](https://scholar.google.com/citations?user=k0KiE4wAAAAJ&hl=en) <sup>3</sup>&nbsp;
10+
[Tong He](https://hetong007.github.io/) <sup>3</sup>&nbsp;
11+
[Mike Zheng Shou](https://sites.google.com/view/showlab) <sup>1</sup>&nbsp;
12+
13+
arXiv 2024
14+
15+
<sup>1</sup> [Show Lab, National University of Singapore](https://sites.google.com/view/showlab/home) &nbsp; <sup>2</sup> Fudan University&nbsp; <sup>3</sup> Amazon&nbsp;
16+
17+
[![arXiv](https://img.shields.io/badge/arXiv-2411.16681-<COLOR>.svg)](https://arxiv.org/abs/2411.16681)
18+
19+
</div>
20+
21+
**News**
22+
* **[2024-11-28]** The code and model will be released soon after internal approval!
23+
* **[2024-11-26]** We released our paper on [arXiv](https://arxiv.org/abs/2411.16681).
24+
25+
## TL;DR
26+
FQGAN is a state-of-the-art visual tokenizer with a novel factorized tokenization design, surpassing VQ and LFQ methods in discrete image reconstruction.
27+
28+
<p align="center"> <img src="assets/rfid_teaser.jpg" width="555"></p>
29+
30+
## Method Overview
31+
32+
FQGAN addresses the codebook usage issue of large codebooks by decomposing a single large codebook into multiple independent sub-codebooks.
33+
By leveraging disentanglement regularization and representation learning objectives, the sub-codebooks learn hierarchical, structured, and semantically meaningful representations.
34+
FQGAN achieves state-of-the-art performance on discrete image reconstruction, surpassing VQ and LFQ methods.
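
The decomposition into sub-codebooks described above can be illustrated with a small, hedged sketch. This is not the released FQGAN code (which is not yet available); it only shows the general idea of quantizing one feature with several independent codebooks. The function names (`nearest_code`, `factorized_quantize`, `make_codebook`), the equal-width split of the feature, and the random codebooks are all assumptions made for illustration. The actual model may use learned projections per sub-codebook, and the disentanglement and representation learning objectives are not modeled here.

```python
import random


def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def factorized_quantize(feature, sub_codebooks):
    """Quantize one feature vector with several independent sub-codebooks.

    Assumption for this sketch: the feature is split into equal-width chunks,
    one chunk per sub-codebook. Each chunk is looked up independently and the
    selected entries are concatenated.
    """
    k = len(sub_codebooks)
    width = len(feature) // k
    indices, quantized = [], []
    for j, codebook in enumerate(sub_codebooks):
        chunk = feature[j * width:(j + 1) * width]
        idx = nearest_code(chunk, codebook)
        indices.append(idx)
        quantized.extend(codebook[idx])
    return indices, quantized


def make_codebook(size, dim, rng):
    """Build a hypothetical random codebook with `size` entries of width `dim`."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(size)]


rng = random.Random(0)
# Two sub-codebooks of 16 entries each, instead of one codebook of 256 entries.
sub_codebooks = [make_codebook(size=16, dim=4, rng=rng) for _ in range(2)]
feature = [rng.uniform(-1, 1) for _ in range(8)]

indices, quantized = factorized_quantize(feature, sub_codebooks)

# Number of distinct index combinations the factorized code can express.
combinations = 1
for codebook in sub_codebooks:
    combinations *= len(codebook)

stored_entries = sum(len(codebook) for codebook in sub_codebooks)
print(indices, combinations, stored_entries)
```

In this toy setup, two sub-codebooks of 16 entries store 32 entries in total but can express 16 × 16 = 256 index combinations, and each lookup only searches 16 entries. This is the general appeal of a factorized design; the sizes used here are arbitrary and are not taken from the paper.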
35+
36+
<p align="center"> <img src="assets/framework.jpg" width="888"></p>
37+
38+
39+
## Comparison with previous visual tokenizers
40+
<p align="center"> <img src="assets/Tab_Tok.png" width="666"></p>
41+
42+
## What has each sub-codebook learned?
43+
<p align="center"> <img src="assets/tsne_dual_codebook.jpg" width="666"></p>
44+
45+
<p align="center"> <img src="assets/recon_codebook.jpg" width="666"></p>
46+
47+
## Can this tokenizer be used for downstream image generation?
48+
49+
<p align="center"> <img src="assets/Tab_AR.png" width="666"></p>
50+
<p align="center"> <img src="assets/AR_gen.jpg" width="888"></p>
51+
52+
## Citation
53+
To cite the paper and model, please use the BibTeX entry below:
54+
```
55+
@article{bai2024factorized,
56+
title={Factorized Visual Tokenization and Generation},
57+
author={Bai, Zechen and Gao, Jianxiong and Gao, Ziteng and Wang, Pichao and Zhang, Zheng and He, Tong and Shou, Mike Zheng},
58+
journal={arXiv preprint arXiv:2411.16681},
59+
year={2024}
60+
}
61+
```
File renamed without changes.
File renamed without changes.
File renamed without changes.

assets/brand/bootstrap-logo-white.svg

Lines changed: 0 additions & 1 deletion
This file was deleted.

assets/brand/bootstrap-logo.svg

Lines changed: 0 additions & 1 deletion
This file was deleted.

assets/cover.css

Lines changed: 0 additions & 124 deletions
This file was deleted.

assets/dist/css/bootstrap.min.css

Lines changed: 0 additions & 6 deletions
This file was deleted.

assets/dist/css/bootstrap.min.css.map

Lines changed: 0 additions & 1 deletion
This file was deleted.

assets/dist/css/bootstrap.rtl.min.css

Lines changed: 0 additions & 6 deletions
This file was deleted.
