
Commit e76da8a: Update README.md (1 parent: 56bddc5)

1 file changed: README.md (30 additions, 28 deletions)
@@ -1,6 +1,3 @@
-This repository mainly covers classical GANs, which focus on stabilizing the training process and generating high-quality images.
-*********************

All have been tested with Python 2.7+ and TensorFlow 1.0+ on Linux.

* Samples: saves the generated data; each folder contains a figure showing the results.
@@ -9,24 +6,24 @@
* nets.py: the Generator and Discriminator are defined here.

For research purposes, all GANs are trained under the same settings:
**Network architecture**: all GANs use the same network architecture (the Discriminators of EBGAN and BEGAN are built by combining the traditional D and G).
**Learning rate**: all initialized to 1e-4 and decayed by a factor of 2 every 5000 epochs (this may be slightly unfair to some GANs, but the influence is small, so I ignored it; a minimal sketch is given right after these settings).
**Dataset**: celebA, cropped to 128 and resized to 64; users should copy all celebA images to `./Datas/celebA` for training.
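For concreteness, here is a minimal sketch of that decay schedule in TensorFlow 1.x. The variable names and the use of a step counter (rather than an epoch counter), as well as the optimizer choice, are assumptions for illustration, not the repository's exact code.

```python
import tensorflow as tf

# Illustrative sketch only: start at 1e-4 and halve every 5000 update periods.
global_step = tf.Variable(0, trainable=False, name='global_step')  # assumed counter
learning_rate = tf.train.exponential_decay(
    learning_rate=1e-4,    # initial learning rate used for all GANs
    global_step=global_step,
    decay_steps=5000,      # "every 5000 epochs" in the text; a step counter is used here
    decay_rate=0.5,        # decay by a factor of 2
    staircase=True)

# The same schedule would then feed the optimizers of G and D, e.g.:
d_optim = tf.train.AdamOptimizer(learning_rate)  # optimizer choice is an assumption
g_optim = tf.train.AdamOptimizer(learning_rate)
```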

- [x] DCGAN
- [x] EBGAN
- [x] WGAN
- [x] BEGAN

And for comparison, I added VAE here.
- [x] VAE

The generated results are shown at the end of this page.

***************


# Theories

:sparkles:DCGAN
--------
@@ -58,7 +55,8 @@ What is an energy function?
![EBGAN_structure](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/Energy_based_model.png)
The figure is from [LeCun, Yann, et al. "A tutorial on energy-based learning."](http://yann.lecun.com/exdb/publis/pdf/lecun-06.pdf)

In EBGAN, we want the Discriminator to distinguish real images from generated (fake) images. How? A simple idea is to set X as the real image and Y as the reconstructed image, and then minimize the energy of X and Y. So we need an auto-encoder to produce Y from X, and a measure to calculate the energy (here simply the MSE).
Finally we get the structure of the Discriminator shown below.

![EBGAN_structure](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/EBGAN_structure.png)
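As a rough sketch of how such an autoencoder-style Discriminator computes its energy (an illustrative assumption, not the actual code in nets.py; `encoder` and `decoder` stand for whatever conv/deconv networks are used):

```python
import tensorflow as tf

def ebgan_discriminator_energy(x, encoder, decoder):
    """Energy of an autoencoder-style D: the reconstruction error of x.

    `encoder` and `decoder` are hypothetical callables (e.g. conv / deconv stacks);
    the real architecture lives in nets.py.
    """
    y = decoder(encoder(x))                                      # Y, the reconstructed image
    energy = tf.reduce_mean(tf.square(y - x), axis=[1, 2, 3])    # per-sample MSE energy
    return energy
```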

@@ -88,8 +86,8 @@ Use EM distance or Wasserstein-1 distance, so GAN can solve the two problems above
**Mathematical Analysis**
Why does the JS divergence have problems? Please see [Towards Principled Methods for Training Generative Adversarial Networks](https://arxiv.org/pdf/1701.04862.pdf)

Anyway, this highlights the fact that **the KL, JS, and TV distances are not sensible cost functions** when learning distributions supported by low dimensional manifolds.

So the authors use the Wasserstein distance:
![WGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/WGAN_loss1.png)
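For reference, a minimal sketch of the WGAN objectives with weight clipping (an assumption following the paper, not necessarily the exact code in this repository):

```python
import tensorflow as tf

def wgan_losses(d_real, d_fake):
    """WGAN objectives from raw critic scores D(x) on real and generated batches.

    The critic maximizes E[D(real)] - E[D(fake)], so we minimize the negation;
    the generator minimizes -E[D(fake)].
    """
    d_loss = tf.reduce_mean(d_fake) - tf.reduce_mean(d_real)
    g_loss = -tf.reduce_mean(d_fake)
    return d_loss, g_loss

def clip_critic_weights(critic_vars, c=0.01):
    """Weight clipping used to (crudely) enforce the Lipschitz constraint ||f||_L <= 1."""
    return [w.assign(tf.clip_by_value(w, -c, c)) for w in critic_vars]
```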
@@ -125,51 +123,55 @@ However, it is difficult to directly calculate the original formula, ||f||_L<=1
We have already introduced the structure of EBGAN, which is also used in BEGAN.
Then, instead of calculating the Wasserstein distance between the sample distributions as WGAN does, BEGAN calculates the Wasserstein distance between the loss distributions.
(I think the mathematical analysis in BEGAN is clearer and more intuitive than in WGAN.)
So, simply replacing E with L, we get the loss function:
![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss1.png)
Then the most interesting part comes:
a new hyper-parameter to control the trade-off between image diversity and visual quality.
![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss2.png)
Lower values of γ lead to lower image diversity because the discriminator focuses more heavily on auto-encoding real images.

The final loss function is:
**Loss Function**
![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss3.png)

The intuition behind the function is easy to understand
(here I describe my understanding roughly; a small code sketch of the k update follows the list):
(1). In the beginning, G and D are initialized randomly and k_0 = 0, so L_real is larger than L_fake, leading to a short increase of k.
(2). After several iterations, D easily learns how to reconstruct the real data, so gamma * L_real - L_fake is negative and k decreases to 0; now D only reconstructs the real data, and G learns the real data distribution so as to minimize the reconstruction error in D.
(3). As G gets better at generating images that look like real data, L_fake becomes smaller and k becomes larger, so D focuses more on discriminating real from fake data, and G in turn is trained further.
(4). In the end, k becomes a constant, which means gamma * L_real - L_fake = 0, so the optimization is done.

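To make the role of k concrete, here is a rough sketch of the BEGAN objectives and the k update as I understand them (an illustration following the paper's formulation, not necessarily the repository's exact code):

```python
import tensorflow as tf

def began_losses_and_k_update(l_real, l_fake, k_t, gamma=0.75, lambda_k=0.001):
    """Sketch of the BEGAN objectives and the k_t update (illustrative assumption).

    l_real, l_fake: D's reconstruction (autoencoder) losses on real / generated batches.
    k_t: a non-trainable tf.Variable initialized to 0 (k_0 = 0).
    """
    d_loss = l_real - k_t * l_fake            # Discriminator minimizes this
    g_loss = l_fake                           # Generator minimizes this
    # k_t tracks gamma * L_real - L_fake and is kept in [0, 1]
    update_k = k_t.assign(
        tf.clip_by_value(k_t + lambda_k * (gamma * l_real - l_fake), 0.0, 1.0))
    return d_loss, g_loss, update_k
```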

And the global loss is defined as the sum of L_real (how well D learns the distribution of the real data) and |gamma*L_real - L_fake| (how close the data generated by G is to the real data):
![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_loss4.png)
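A corresponding sketch of that convergence measure (again an assumption following the text, not the repo's exact code):

```python
import tensorflow as tf

def began_convergence_measure(l_real, l_fake, gamma=0.75):
    """M_global = L_real + |gamma * L_real - L_fake|; lower means better convergence."""
    return l_real + tf.abs(gamma * l_real - l_fake)
```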


I set gamma = 0.75 and the learning rate of k to 0.001; the resulting learning curves of the loss and k are shown below.
![BEGAN_loss](https://raw.githubusercontent.com/YadiraF/Images/master/GAN/BEGAN_curve.png)


# Results

DCGAN
![DCGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/dcgan/497.png)

EBGAN (not trained enough)
![EBGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/ebgan/109_r.png)

WGAN (not trained enough)
![WGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/wgan/260.png)

BEGAN: gamma = 0.75, learning rate of k = 0.001
![BEGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/began_n/369_r.png)

BEGAN: gamma = 0.5, learning rate of k = 0.002
![BEGAN_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/began/228_r.png)

VAE
![VAE_samples](https://raw.githubusercontent.com/YadiraF/GAN_Theories/master/Samples/vae/499_s.png)

# References
http://wiseodd.github.io/techblog/2016/12/10/variational-autoencoder/ (a good blog introducing VAE)
