
Codebook embedding does not update #14

Open
zhxgj opened this issue Apr 24, 2020 · 4 comments

Comments

@zhxgj

zhxgj commented Apr 24, 2020

I found ctx.needs_input_grad[1] is False during training VQ-VAE. Is this correct, and does it mean the embedding of the codebook does not update during training?

if ctx.needs_input_grad[1]:
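For context, the check in question lives in the backward of a custom torch.autograd.Function. Below is a minimal sketch of a straight-through vector-quantization Function of the kind typically used here (names and details are illustrative, not the repo's exact code):

```python
import torch
from torch.autograd import Function

class VQStraightThrough(Function):
    @staticmethod
    def forward(ctx, inputs, codebook):
        # Nearest-neighbour lookup: pick the closest codebook row per input.
        distances = torch.cdist(inputs, codebook)   # (N, K)
        indices = distances.argmin(dim=1)           # (N,)
        ctx.save_for_backward(indices, codebook)
        ctx.mark_non_differentiable(indices)
        return codebook[indices], indices

    @staticmethod
    def backward(ctx, grad_output, grad_indices):
        grad_inputs = grad_codebook = None
        if ctx.needs_input_grad[0]:
            # Straight-through estimator: copy the gradient to the encoder output.
            grad_inputs = grad_output.clone()
        if ctx.needs_input_grad[1]:
            # The branch discussed in this issue: scatter the output
            # gradient back into the selected codebook rows.
            indices, codebook = ctx.saved_tensors
            grad_codebook = torch.zeros_like(codebook)
            grad_codebook.index_add_(0, indices, grad_output)
        return grad_inputs, grad_codebook
```

Note that ctx.needs_input_grad[1] is True only if the codebook tensor passed into apply() requires a gradient at that call site.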

@zhangbo2008

I agree with the commenter above. It is so weird.

@chenaoxuan

> I found ctx.needs_input_grad[1] is False during training VQ-VAE. Is this correct, and does it mean the embedding of the codebook does not update during training?
>
> if ctx.needs_input_grad[1]:

This part of the code is never executed! But I printed model.codebook.embedding.weight.data and found that the codebook does get updated!
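One plausible explanation, sketched under the standard VQ-VAE training objective (all names here are illustrative): the codebook loss reaches embedding.weight through an ordinary, differentiable embedding lookup, a second path that bypasses the custom Function entirely.

```python
import torch
import torch.nn.functional as F

embedding = torch.nn.Embedding(num_embeddings=512, embedding_dim=64)
z_e = torch.randn(8, 64, requires_grad=True)      # stand-in for encoder output

# Nearest-code assignment; done without grad, as in the custom Function.
with torch.no_grad():
    indices = torch.cdist(z_e, embedding.weight).argmin(dim=1)

z_q = embedding(indices)                          # plain differentiable lookup

codebook_loss = F.mse_loss(z_q, z_e.detach())     # pulls codes toward encoder outputs
commit_loss = F.mse_loss(z_e, z_q.detach())       # pulls encoder toward codes

(codebook_loss + 0.25 * commit_loss).backward()
print(embedding.weight.grad.abs().sum())          # non-zero: the codebook gets gradients
```

So the codebook weights can update through this second path even if the needs_input_grad[1] branch never runs.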

@Roller44

Roller44 commented Jul 12, 2023

Actually, ctx.needs_input_grad[0] and ctx.needs_input_grad[1] are set to true and false alternately.
For the 1st step, ctx.needs_input_grad[0] is true and ctx.needs_input_grad[1] is false.
For the 2nd step, ctx.needs_input_grad[0] becomes false and ctx.needs_input_grad[1] becomes true.
For the 3rd step, ctx.needs_input_grad[0] is true and ctx.needs_input_grad[1] is false.
And so on.

This setting is reasonable because there are two "agents", namely the codebook and the autoencoder, each updating w.r.t. a different part of the loss function.

@RipeMangoBox

RipeMangoBox commented May 10, 2024

> Actually, ctx.needs_input_grad[0] and ctx.needs_input_grad[1] are set to true and false alternately. For the 1st step, ctx.needs_input_grad[0] is true and ctx.needs_input_grad[1] is false. For the 2nd step, ctx.needs_input_grad[0] becomes false and ctx.needs_input_grad[1] becomes true. For the 3rd step, ctx.needs_input_grad[0] is true and ctx.needs_input_grad[1] is false. And so on.
>
> This setting is reasonable because there are two "agents", namely the codebook and the autoencoder, each updating w.r.t. a different part of the loss function.

I debugged the code and found that ctx.needs_input_grad[1] is always False; it is not set to true and false alternately.
A basic fact is that a variable $A$ not requiring a gradient does not mean it will not be updated during optimization. The attribute requires_grad (which ctx.needs_input_grad mirrors) describes whether a gradient should be calculated for $A$ through this particular path. In other words, it controls gradient computation through this Function, not whether $A$ itself can be updated: $A$ can still receive gradients, and optimizer updates, through another path in the graph!

Therefore, even though ctx.needs_input_grad[1] is always False, the codebook can still be updated.
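A quick probe (a hypothetical minimal Function, purely for illustration) confirms that ctx.needs_input_grad simply mirrors the requires_grad flags of the Function's inputs at call time, so it will not alternate on its own:

```python
import torch
from torch.autograd import Function

class Probe(Function):
    @staticmethod
    def forward(ctx, a, b):
        return a + b

    @staticmethod
    def backward(ctx, g):
        # Each entry is True iff the corresponding input required a gradient.
        print("needs_input_grad:", ctx.needs_input_grad)
        return (g if ctx.needs_input_grad[0] else None,
                g if ctx.needs_input_grad[1] else None)

a = torch.ones(3, requires_grad=True)
b = torch.ones(3, requires_grad=False)   # like a codebook fed in detached
Probe.apply(a, b).sum().backward()       # prints: needs_input_grad: (True, False)
```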
