|
1 |
| -DCVGAN: Depth Conditional Video Generation |
2 |
| --- |
3 |
| - |
4 |
| -This repository contains the official PyTorch implementation of DCVGAN. |
5 |
| - |
6 |
| -[Yuki Nakahira and Kazuhiko Kawamoto, DCVGAN: Depth Conditional Video Generation, 2019 IEEE International Conference on Image Processing, ICIP 2019.](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8803764) |
7 |
| - |
8 |
| - |
9 |
| - |
10 |
| -## About |
11 |
| - |
12 |
| -This paper proposes a new GAN architecture for video generation with depth videos and color videos. |
13 |
| - |
14 |
| -The proposed model explicitly uses the depth information in a video sequence as an additional input to a GAN-based video generation scheme, so that the model captures scene dynamics more accurately. |
15 |
| - |
16 |
| -The model is trained on pairs of color and depth videos, and generates a video in two steps: |
17 |
| - |
18 |
| -1. Generate the depth video to model the scene dynamics based on the geometrical information. |
19 |
| -2. Translate each depth image to a color image (domain translation from depth to color) to add appropriate color to the geometrical information of the scene. |
20 |
| - |
21 |
| -The architecture is shown below. The generator consists of three networks: a frame seed generator ($R_M$), a depth image generator ($G_D$), and a color image generator ($G_C$). |
22 |
| - |
23 |
| -<p align="center"> |
24 |
| -<img src="https://user-images.githubusercontent.com/13511520/57762583-baf5f300-773a-11e9-942d-858c2d834536.png" width="50%"> |
25 |
| -</p> |
26 |
| - |
27 |
| -In addition, the model has two discriminators: an image discriminator ($D_I$) and a video discriminator ($D_V$). The detailed network architecture is shown below. |
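
The two-step generation flow described above can be sketched as follows. This is a framework-free illustration of the data flow only, not the repository's code: the three functions are hypothetical stand-ins for $R_M$, $G_D$ and $G_C$ (which are learned networks in the actual model), and the seed size, frame count and image size are arbitrary assumptions.

```python
import random


def frame_seed_generator(n_frames, dim, rng):
    """Stand-in for R_M: one seed vector per frame, each derived from the previous one."""
    seed = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    seeds = []
    for _ in range(n_frames):
        # Toy recurrence (assumption); the real R_M is a learned network.
        seed = [0.9 * s + 0.1 * rng.gauss(0.0, 1.0) for s in seed]
        seeds.append(seed)
    return seeds


def depth_image_generator(seed, size):
    """Stand-in for G_D: map one frame seed to a size x size depth image in [0, 1]."""
    mean = sum(seed) / len(seed)
    return [
        [min(1.0, max(0.0, 0.5 + mean + 0.01 * (r + c))) for c in range(size)]
        for r in range(size)
    ]


def color_image_generator(depth):
    """Stand-in for G_C: translate one depth image to an RGB image, pixel by pixel."""
    return [[(d, 0.5 * d, 1.0 - d) for d in row] for row in depth]


def generate_video(n_frames=4, dim=8, size=2, seed=0):
    rng = random.Random(seed)
    seeds = frame_seed_generator(n_frames, dim, rng)
    # Step 1: generate the depth video, which models the scene dynamics.
    depth_video = [depth_image_generator(s, size) for s in seeds]
    # Step 2: translate every depth frame to a color frame independently.
    color_video = [color_image_generator(d) for d in depth_video]
    return depth_video, color_video


depth_video, color_video = generate_video()
print(len(depth_video), len(color_video))
```

The point of the sketch is the ordering: motion is decided entirely in the depth domain, and color is added afterwards frame by frame.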
28 |
| - |
29 |
| -## Result |
30 |
| - |
31 |
| -#### facial expression dataset |
32 |
| - |
33 |
| -<p align="center"> |
34 |
| -<img src="https://user-images.githubusercontent.com/13511520/54088503-f58d8900-43a1-11e9-8b27-1eca5a7d8e98.gif" width="640px"> |
35 |
| -</p> |
36 |
| - |
37 |
| - |
38 |
| -#### hand gesture dataset |
39 |
| - |
40 |
| -<p align="center"> |
41 |
| -<img src="https://user-images.githubusercontent.com/13511520/54088434-75672380-43a1-11e9-9f7e-c6ff1bc0b77b.gif" width="640px"> |
42 |
| -</p> |
43 |
| - |
44 |
| - |
45 |
| -## Network Architecture |
46 |
| - |
47 |
| -### Generators |
48 |
| - |
49 |
| -<p align="center"> |
50 |
| -<img src="https://user-images.githubusercontent.com/13511520/57746277-743cd480-770b-11e9-8066-c3b6b64426aa.png" width="60%"> |
51 |
| -</p> |
52 |
| - |
53 |
| -### Discriminators |
54 |
| - |
55 |
| -<p align="center"> |
56 |
| -<img src="https://user-images.githubusercontent.com/13511520/57746276-73a43e00-770b-11e9-90ec-9dcc58ffc3b6.png" width="60%"> |
57 |
| -</p> |
58 |
| - |
59 |
| -## Usage |
60 |
| - |
61 |
| -### 1. Clone the repository |
62 |
| - |
63 |
| -```shell |
64 |
| -git clone https://github.com/raahii/dcvgan.git |
65 |
| -cd dcvgan |
66 |
| -``` |
67 |
| - |
68 |
| - |
69 |
| - |
70 |
| -### 2. Install dependencies |
71 |
| - |
72 |
| -#### Requirements |
73 |
| - |
74 |
| -- Python 3.7 |
75 |
| -- PyTorch |
76 |
| -- FFmpeg |
77 |
| -- OpenCV |
78 |
| -- GraphViz |
79 |
| - |
80 |
| -#### Using docker |
81 |
| - |
82 |
| - Easy. Thanks :whale: |
83 |
| - |
84 |
| - ```shell |
85 |
| - docker build -t dcvgan -f docker/Dockerfile.gpu . |
86 |
| - docker run --runtime=nvidia -v $(pwd):/home/user/dcvgan -it dcvgan |
87 |
| - ``` |
88 |
| - |
89 |
| -#### Manual installation |
90 |
| - |
91 |
| - I recommend using pyenv and conda to install the dependencies. For instance, I set up my environment as follows. |
92 |
| - |
93 |
| - ```shell |
94 |
| - pyenv install miniconda3-4.3.30 |
95 |
| - pyenv local miniconda3-4.3.30 |
96 |
| - conda install -y ffmpeg |
97 |
| - pip install -r requirements.txt |
98 |
| - ``` |
99 |
| - |
100 |
| - For details, please refer to my [Dockerfile](https://github.com/raahii/dcvgan/blob/master/docker/Dockerfile.gpu). |
101 |
| - |
102 |
| - |
103 |
| -### 2. Prepare the dataset |
104 |
| - |
105 |
| -- facial expression: [MUG Facial Expression Database](https://mug.ee.auth.gr/fed/) |
106 |
| -- hand gesture: [Chalearn LAP IsoGD Database](http://www.cbsr.ia.ac.cn/users/jwan/database/isogd.html) |
107 |
| - |
108 |
| -Please follow the instructions on each official page to obtain the dataset. Preprocessing code for the facial expression dataset is not available yet. I recommend placing the dataset under `data/raw/`. |
109 |
| - |
110 |
| - |
111 |
| - |
112 |
| -### 3. Training |
113 |
| - |
114 |
| -``` |
115 |
| -python src/train.py --config <config.yml> |
116 |
| -tensorboard --logdir <result dir> |
117 |
| -``` |
118 |
| - |
119 |
| -On the first run, preprocessing of the dataset starts automatically. Preprocessing converts each dataset into the common format used by [VideoDataset](https://github.com/raahii/dcvgan/blob/master/src/dataset.py#L18). |
120 |
| - |
121 |
| -Please refer to and edit `configs/XXX.yml` to change training settings such as the number of training epochs, the batch size, and the result directory. |
122 |
| - |
123 |
| -### 4. Sampling |
124 |
| - |
125 |
| -``` |
126 |
| -python src/generate_samples.py <result dir> <iteration> <save dir> |
127 |
| -``` |
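
To sample from several checkpoints in one go, the command above can be assembled programmatically. The sketch below only builds the argument lists and does not run anything. The result directory, save directory and iteration numbers are hypothetical placeholders, and the per-iteration subdirectory layout is my own choice rather than something the script requires.

```python
import shlex
from pathlib import PurePosixPath


def sampling_commands(result_dir, iterations, save_root):
    """Build one `generate_samples.py` command per checkpoint iteration."""
    commands = []
    for it in iterations:
        # Hypothetical layout: one output directory per iteration.
        save_dir = PurePosixPath(save_root) / f"iter_{it}"
        commands.append(
            ["python", "src/generate_samples.py", str(result_dir), str(it), str(save_dir)]
        )
    return commands


# Placeholder values; replace them with your own result directory and iterations.
commands = sampling_commands("result/example", [10000, 20000], "samples")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the placeholder values match an actual training run.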
128 |
| - |
129 |
| - |
130 |
| - |
131 |
| -### 5. Evaluation |
132 |
| - |
133 |
| -I have published a framework for efficient evaluation of video generation, [video-gans-evaluation](https://github.com/raahii/video-gans-evaluation). The framework currently supports `Inception Score`, `FID` and `PRD`. **Please star the repository if you like it!** :relaxed: |
134 |
| - |
135 |
| - |
136 |
| - |
137 |
| ----- |
138 |
| - |
139 |
| - |
140 |
| - |
141 |
| -## TODOS |
142 |
| - |
143 |
| -- [ ] upload pretrained models |
144 |
| -- [x] dockerize |
145 |
| - |
| 1 | +[Note]: This repository contains experimental code for my research project, but it was once also the official implementation of 'DCVGAN', presented at ICIP 2019. If you came here from the conference paper, please see the [past version](https://github.com/raahii/dcvgan/tree/08564d00a251fd1f8fe5d1ce22893ee607d5f79a) and the [release](https://github.com/raahii/dcvgan/releases/tag/icip2019). |