
Different mixup loss function between code and paper #18

lukk47 opened this issue Apr 23, 2019 · 3 comments


lukk47 commented Apr 23, 2019

The mixup loss function in the code is as below:
[screenshot of the mixup loss code]
while, according to the paper, the mixing should be done before the loss is computed.

Will these two loss functions give the same result?
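
For readers without the screenshot: the loss in the code is presumably the usual mixup-criterion form, while the paper mixes the (one-hot) targets before computing a single cross-entropy. A hypothetical sketch of the two formulations, with `criterion`, `pred`, `y_a`, `y_b`, and `lam` named as in typical mixup code (not the repo's exact code):

```python
import torch
import torch.nn.functional as F

def mixup_loss_from_code(criterion, pred, y_a, y_b, lam):
    # Formulation in the code: compute the loss against each label set, then mix the losses.
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)

def mixup_loss_from_paper(pred, y_a, y_b, lam, num_classes):
    # Formulation in the paper: mix the one-hot targets first, then compute one cross-entropy.
    y_mix = (lam * F.one_hot(y_a, num_classes).float()
             + (1 - lam) * F.one_hot(y_b, num_classes).float())
    return -(y_mix * F.log_softmax(pred, dim=1)).sum(dim=1).mean()
```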

@kleinzcy

@LokLu It is the same.

[image: kleinzcy's derivation]

So the two loss functions are the same.

Let me know if I am wrong.
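
The derivation image is not preserved, but the key step is presumably that cross-entropy is linear in the target: for softmax probabilities $p$ and one-hot targets $y_a$, $y_b$,

```math
\ell\bigl(p,\ \lambda y_a + (1-\lambda)\,y_b\bigr)
  = -\sum_k \bigl(\lambda y_{a,k} + (1-\lambda)\,y_{b,k}\bigr)\log p_k
  = \lambda\,\ell(p, y_a) + (1-\lambda)\,\ell(p, y_b)
```

so mixing the targets before the loss and mixing the two losses give the same value.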


lukk47 commented Dec 12, 2019

@kleinzcy Your equations are correct. But the problem is that `pred` in the code is the logits output of the model, not the softmax of `pred`.


lizc126 commented Nov 7, 2020

> @kleinzcy Your equations are correct. But the problem is that `pred` in the code is the logits output of the model, not the softmax of `pred`.

Hi, I just saw this and am curious as well. But shouldn't the criterion ideally take the logits output rather than the softmax of `pred`?
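
As a sanity check (a minimal sketch, assuming the criterion is `nn.CrossEntropyLoss`, which expects raw logits and applies log-softmax internally), the two formulations still agree numerically when fed logits, because the target only enters linearly after the log-softmax:

```python
# Minimal numerical check, assuming nn.CrossEntropyLoss as the criterion
# (it takes raw logits and applies log-softmax internally).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch, classes, lam = 8, 10, 0.7
logits = torch.randn(batch, classes)       # raw model outputs (logits, not softmax)
y_a = torch.randint(0, classes, (batch,))  # labels of the original batch
y_b = torch.randint(0, classes, (batch,))  # labels of the shuffled batch

criterion = nn.CrossEntropyLoss()

# Code formulation: mix the two losses.
loss_a = lam * criterion(logits, y_a) + (1 - lam) * criterion(logits, y_b)

# Paper formulation: mix the one-hot targets, then compute one cross-entropy.
y_mix = lam * F.one_hot(y_a, classes).float() + (1 - lam) * F.one_hot(y_b, classes).float()
loss_b = -(y_mix * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(loss_a.item(), loss_b.item())  # the two values match
```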
