What's your training method of BNN #18

Open
MJChku opened this issue Jan 28, 2021 · 1 comment

MJChku commented Jan 28, 2021

Hi Guys,
I don't understand how you train your binary weights: if you use the sign function, its gradient is zero almost everywhere, so backpropagation breaks at the sign function. Looking at your source code, I couldn't figure out how you handle this (in the XNOR-Net paper, they keep real-valued weights for the update). Could you point me to where you handle this? Thanks!

Also, the quantization seems unstable. I implemented a flow model based on your code, and the quantization is unstable compared to my PyTorch implementation. I think the reason lies in how you handle the gradient update.
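
For illustration, here is a minimal PyTorch snippet (not from your repository) showing what I mean by the gradient being broken: torch.sign has zero gradient almost everywhere, so nothing flows back to the weights.

```python
import torch

# A weight tensor we would like to train.
w = torch.randn(4, requires_grad=True)

# Naive binarization: the derivative of sign(x) is zero almost everywhere,
# so no gradient signal reaches w.
loss = torch.sign(w).sum()
loss.backward()
print(w.grad)  # tensor([0., 0., 0., 0.]) -- training is stuck
```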

Jopyth (Collaborator) commented Jan 28, 2021

Hi,

You can check the implementation of our det_sign function here.

It uses the identity function (a straight-through estimator, STE) in the backward pass.
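
In other words, this kind of straight-through estimator can be sketched in a few lines of PyTorch. This is only a generic illustration, not our actual det_sign code:

```python
import torch

class DetSignSTE(torch.autograd.Function):
    """Deterministic sign in the forward pass, identity (straight-through)
    gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Map to {-1, +1}; use a comparison so that 0 maps to +1
        # instead of the 0 that torch.sign would produce.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient through unchanged.
        return grad_output

w = torch.randn(4, requires_grad=True)
loss = DetSignSTE.apply(w).sum()
loss.backward()
print(w.grad)  # all ones -- the gradient now reaches the latent weights
```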

You can also check the implementation of a simple binary convolution here; it stores the weights as 32-bit values during training.
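
As a rough sketch of that idea (again in plain PyTorch rather than our actual code), a binary convolution can keep its latent weights in 32-bit floating point and binarize them only on the fly in the forward pass, so the optimizer always updates real-valued weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConv2d(nn.Conv2d):
    """Convolution whose weights are binarized on the fly.

    `self.weight` stays a 32-bit float Parameter, so the optimizer keeps
    updating real-valued weights; only the forward pass sees binary values.
    """

    def forward(self, x):
        w = self.weight
        # Binarize to {-1, +1}; the (... - w).detach() + w trick makes the
        # backward pass behave like the identity (straight-through estimator).
        w_sign = torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))
        w_bin = (w_sign - w).detach() + w
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Hypothetical usage: the optimizer holds and updates the float weights.
conv = BinaryConv2d(3, 8, kernel_size=3, padding=1)
opt = torch.optim.SGD(conv.parameters(), lr=0.1)
loss = conv(torch.randn(1, 3, 16, 16)).sum()
loss.backward()
opt.step()
```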

Best regards,
Joseph
