
Add support for lecun normal weight initialization #2290

Open
p-w-rs opened this issue Jul 15, 2023 · 11 comments

Comments


p-w-rs commented Jul 15, 2023

Motivation and description

LeCun normal initialization is needed (as far as I understand) to properly build self-normalizing neural networks.

Since Flux already provides the selu activation function and AlphaDropout, it would be nice to have LeCun normal built in as well.

Possible Implementation

Draw samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor. (This description is taken from the TensorFlow documentation.)
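
For concreteness, here is a minimal, self-contained sketch of that sampling rule (this is not Flux's API; the function name and the ±2σ truncation bound are assumptions for illustration only):

```julia
# Minimal sketch: draw weights from a normal with stddev sqrt(1 / fan_in),
# re-sampling any entries outside ±2σ as a simple form of truncation.
function lecun_normal_sample(fan_in::Integer, dims::Integer...)
    σ = sqrt(1f0 / fan_in)
    w = σ .* randn(Float32, dims...)
    outside = abs.(w) .> 2σ
    while any(outside)
        w[outside] .= σ .* randn(Float32, count(outside))
        outside = abs.(w) .> 2σ
    end
    return w
end

# e.g. a 128×784 weight matrix whose fan_in is 784:
W = lecun_normal_sample(784, 128, 784)
```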

@darsnack (Member)

Could probably use the existing truncated_normal with the standard deviation set according to the fan-in.

@vortex73

I'd be interested in solving this if it's still open. Please elaborate if it is.

@darsnack (Member)

In src/utils.jl we already have a Flux.truncated_normal initializer that accepts a custom standard deviation. A PR would add a new initializer, Flux.lecun_normal, that simply calls the existing truncated_normal with the standard deviation computed as described above using Flux.nfan, also in src/utils.jl.
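
For reference, a hedged sketch of what such a Flux.lecun_normal might look like, assuming Flux.truncated_normal accepts an optional RNG and a std keyword, and that the internal Flux.nfan(dims...) returns (fan_in, fan_out) as in src/utils.jl (exact signatures should be checked against the current source):

```julia
using Flux, Random

# Hypothetical lecun_normal initializer: a thin wrapper over the existing
# truncated_normal, with std computed from the fan-in returned by Flux.nfan.
lecun_normal(rng::AbstractRNG, dims::Integer...) =
    Flux.truncated_normal(rng, dims...; std = sqrt(1 / first(Flux.nfan(dims...))))
lecun_normal(dims::Integer...) = lecun_normal(Random.default_rng(), dims...)

# Usage as a layer initializer, e.g. with selu for a self-normalizing network:
# Dense(784 => 128, selu; init = lecun_normal)
```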

@chiral-carbon

Hi @vortex73, are you still working on this?


darsnack commented Aug 9, 2023

@chiral-carbon Please go ahead and open a PR if you are willing to tackle this.

@chiral-carbon

@darsnack thanks, will open a PR soon


chiral-carbon commented Aug 14, 2023

@RohitRathore1 were you working on this? I had a PR in the works but will stop if so. cc @darsnack

@Bhavay-2001

Hi @chiral-carbon, @darsnack. Is this issue still unclaimed? Can I start working on it?

@chiral-carbon

@Bhavay-2001 I had claimed this, but then a PR was opened soon after by someone else, so I'm not sure about the status. If this issue opens up again for a new PR, I would like to work on it.

@RohitRathore1

Hi @chiral-carbon, I'm not sure. I opened a PR but haven't received any comments on it, and when I went to check the logs of one of its GitHub Actions runs, they were no longer available. I will have to review it again.

@ToucheSir (Member)

I don't recall why that PR didn't get comments. Maybe someone was waiting for tests to be in place? Anyhow, that would be my feedback now. We can continue on the PR thread :)
