kd_loss implementation issue #74
Comments
I remember that .mean(1) is equivalent to reduction='batchmean'?
Here is the source code of [...]. And the problem here is that the [...]
So batchmean equals .mean(0)?
No. "batchmean" means .sum()/batch_size, i.e., .sum(1).mean() |
OK, I get your point: you mean that mathematically .sum(1) is the correct implementation, and .mean(1) = .sum(1)/17
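As a quick sanity check of the reduction semantics discussed above (a minimal sketch with made-up shapes, not code from this repository):

```python
import torch
import torch.nn.functional as F

# Toy logits: batch of 8 samples, 17 bins (reg_max = 16 gives reg_max + 1 bins).
torch.manual_seed(0)
student = torch.randn(8, 17)
teacher = torch.randn(8, 17)

log_p = F.log_softmax(student, dim=1)  # F.kl_div expects log-probabilities as input
q = F.softmax(teacher, dim=1)          # and probabilities as target (log_target=False)

elementwise = F.kl_div(log_p, q, reduction='none')  # shape (8, 17)

# 'batchmean' = total sum / batch size = per-sample sum, then mean over the batch.
print(torch.allclose(F.kl_div(log_p, q, reduction='batchmean'),
                     elementwise.sum(1).mean()))        # True

# 'mean' divides by the number of elements (8 * 17), i.e. an extra factor of 17.
print(torch.allclose(F.kl_div(log_p, q, reduction='mean'),
                     elementwise.mean(1).mean()))       # True
print(torch.allclose(F.kl_div(log_p, q, reduction='mean') * 17,
                     F.kl_div(log_p, q, reduction='batchmean')))  # True
```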
BTW, I also found that [...]
It's an intended behavior, because experiments show that not dividing is better. I don't know the theory behind this, though.
I see, thanks for the reply.
Hello, I found that the knowledge_distillation_kl_div_loss() in mmdet/models/losses/kd_loss.py uses a different implementation compared to the normal KL Div definition: it is equivalent to F.kl_div(reduction='mean') instead of F.kl_div(reduction='batchmean'), as described in the F.kl_div documentation. The correct KL Div should be like:
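(The code block from the original post did not survive extraction; the snippet below is a sketch of a batchmean-style KD loss in the spirit of the discussion. The function name, temperature handling, and default T are chosen for illustration, not taken from the repository.)

```python
import torch.nn.functional as F

def kd_kl_div_loss_batchmean(pred, soft_label, T=10):
    """KL-divergence distillation loss with 'batchmean' semantics (illustrative sketch).

    pred, soft_label: (N, n) logits, e.g. n = reg_max + 1 = 17 for GFL.
    T: distillation temperature (value assumed for this sketch).
    """
    target = F.softmax(soft_label / T, dim=1).detach()
    # Sum over the distribution dimension, then average over the batch:
    # this matches F.kl_div(..., reduction='batchmean').
    return F.kl_div(
        F.log_softmax(pred / T, dim=1), target,
        reduction='none').sum(1).mean() * (T * T)
```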
Is there any reason to use the current implementation? The current kl_div is 1/17 of the real kl_div when gfl reg_max=16 (i.e., 17 bins).