
Issues with validation training and e-prop #80

Open
neworderofjamie opened this issue Oct 14, 2023 · 3 comments
Labels: bug (Something isn't working), e-prop
Milestone: mlGeNN 2.2

Comments

@neworderofjamie
Contributor

neworderofjamie commented Oct 14, 2023

I clearly wasn't thinking about e-prop when I added support for training with validation in #57. Currently, gradients are also accumulated during validation (e.g. at https://github.com/genn-team/ml_genn/blob/master/ml_genn/ml_genn/compilers/eprop_compiler.py#L209) and will therefore get applied at the end of the first training batch of the next epoch. In e-prop, the state (especially the gradients) needs to be fully reset at the start of each training epoch. I think the cleanest solution is to fully reset the state at the beginning of each training epoch. That way, eligibility traces, adaptation variables and gradients accumulated during validation won't affect training, but the model will be fully adapted after a full epoch of training, so validation is a bit more realistic.
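A minimal sketch of the proposed fix, resetting all e-prop state at the start of each training epoch so nothing accumulated during the previous epoch's validation pass leaks into training. All names below (`EPropState`, `train_with_validation`, etc.) are illustrative stand-ins, not actual ml_genn API:

```python
# Illustrative sketch only -- not ml_genn code. Shows the proposed
# training loop structure: fully reset e-prop state (eligibility traces,
# adaptation variables, accumulated gradients) at the START of each
# training epoch, before any training batches run.

class EPropState:
    """Hypothetical container for per-epoch e-prop state."""
    def __init__(self):
        self.eligibility_trace = 0.0
        self.adaptation = 0.0
        self.gradient = 0.0

    def reset(self):
        # Reset everything back to initial values, including gradients
        # accumulated during the previous epoch's validation pass
        self.eligibility_trace = 0.0
        self.adaptation = 0.0
        self.gradient = 0.0

def train_with_validation(state, num_epochs, train_epoch, validate_epoch):
    for epoch in range(num_epochs):
        # Reset BEFORE training, not after validation: validation at the
        # end of the previous epoch also accumulates state, but this way
        # it is discarded before it can affect the next epoch's updates.
        state.reset()
        train_epoch(state)     # accumulates traces/gradients, applies updates
        validate_epoch(state)  # evaluates on a model adapted by a full
                               # epoch of training; its accumulated state
                               # is thrown away at the next reset
```

The key design point is the placement of `reset()`: resetting at the start of an epoch (rather than after validation) means validation still runs on a fully adapted model, while its side effects never feed into training.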

@neworderofjamie neworderofjamie added the bug Something isn't working label Oct 14, 2023
@neworderofjamie neworderofjamie added this to the mlGeNN 2.2 milestone Oct 14, 2023
@tnowotny
Member

I still don't understand how to handle the adaptation variables, though. Resetting makes sense in some ways, but it's sub-optimal if our intuition is right that there is a "working regime" away from the reset values.

@neworderofjamie
Contributor Author

But incorporating more information about that regime from the validation data into the next training epoch seems like cheating.

@tnowotny
Member

Yes, it could lead to overestimating validation performance. It wouldn't mean actually overstating accuracy, as long as the test set stays separate. However, how should we then handle evaluation on the test set?
