What's your training method of BNN #18

Open
MJChku opened this issue Jan 28, 2021 · 1 comment

MJChku commented Jan 28, 2021

Hi Guys,
I don't understand how you train your binary weights: if you use the sign function, its gradient is zero almost everywhere, so backpropagation breaks at the sign function. Looking at your source code, I couldn't figure out how you handle this (in the XNOR-Net paper, they keep real-valued weights for the update). Could you point me to where you handle this? Thanks!

Also, the quantization seems unstable. I implemented a flow model based on your code, and the quantization is unstable compared to my PyTorch implementation. I think the reason lies in how you handle the gradient update.
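
For illustration, here is a minimal PyTorch snippet (not from your repository) showing what I mean by the gradient being broken: torch.sign has zero gradient almost everywhere, so nothing flows back to the weights.

```python
import torch

# A weight tensor we would like to train.
w = torch.randn(4, requires_grad=True)

# Naive binarization: the derivative of sign(x) is zero almost everywhere,
# so no gradient signal reaches w.
loss = torch.sign(w).sum()
loss.backward()
print(w.grad)  # tensor([0., 0., 0., 0.]) -- training is stuck
```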

Jopyth (Collaborator) commented Jan 28, 2021

Hi,

You can check the implementation of our det_sign function here.

It uses the identity function (a straight-through estimator, STE) in the backward pass.
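
In other words, this kind of straight-through estimator can be sketched in a few lines of PyTorch. This is only a generic illustration, not our actual det_sign code:

```python
import torch

class DetSignSTE(torch.autograd.Function):
    """Deterministic sign in the forward pass, identity (straight-through)
    gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Map to {-1, +1}; use a comparison so that 0 maps to +1
        # instead of the 0 that torch.sign would produce.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient through unchanged.
        return grad_output

w = torch.randn(4, requires_grad=True)
loss = DetSignSTE.apply(w).sum()
loss.backward()
print(w.grad)  # all ones -- the gradient now reaches the latent weights
```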

You can also check the implementation of a simple binary convolution here; it stores the weights as 32-bit values during training.
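
As a rough sketch of that idea (again in plain PyTorch rather than our actual code), a binary convolution can keep its latent weights in 32-bit floating point and binarize them only on the fly in the forward pass, so the optimizer always updates real-valued weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConv2d(nn.Conv2d):
    """Convolution whose weights are binarized on the fly.

    `self.weight` stays a 32-bit float Parameter, so the optimizer keeps
    updating real-valued weights; only the forward pass sees binary values.
    """

    def forward(self, x):
        w = self.weight
        # Binarize to {-1, +1}; the (... - w).detach() + w trick makes the
        # backward pass behave like the identity (straight-through estimator).
        w_sign = torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))
        w_bin = (w_sign - w).detach() + w
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Hypothetical usage: the optimizer holds and updates the float weights.
conv = BinaryConv2d(3, 8, kernel_size=3, padding=1)
opt = torch.optim.SGD(conv.parameters(), lr=0.1)
loss = conv(torch.randn(1, 3, 16, 16)).sum()
loss.backward()
opt.step()
```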

Best regards,
Joseph
