Checkpoint size increases with every save. #134
Comments
Is it possible to share the weights and state with me, so I can debug this and fix the issue? Anyway, that's the first time I've seen an issue like that. I have trained a
Sure. I sent them to you by email. Please check. Thank you so much!
This issue might be fixed due to recent changes and bug fixes in fjformer over the past few days.
Describe the bug
Hi, when I'm fine-tuning Gemma, the checkpoint size is constant at the beginning, then grows bigger and bigger. Once it reaches 5.99GB, fine-tuning can still continue, but no new checkpoint can be saved and this error is raised:
ValueError: unicode string is too large
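To make the growth easier to report, here is a minimal sketch that lists checkpoint file sizes in the order they were written. It assumes the checkpoints are saved as individual files in a single directory; the helper names `checkpoint_sizes` and `growth_between_saves` are hypothetical and not part of fjformer.

```python
import os


def checkpoint_sizes(directory):
    """Return (filename, size_in_bytes) pairs, oldest file first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            # Sort key is (modification time, name); size is carried along.
            entries.append((os.path.getmtime(path), name, os.path.getsize(path)))
    entries.sort()
    return [(name, size) for _, name, size in entries]


def growth_between_saves(sizes):
    """Return the size difference in bytes between consecutive checkpoints."""
    return [later[1] - earlier[1] for earlier, later in zip(sizes, sizes[1:])]
```

If the model and optimizer state keep the same shapes, every value returned by `growth_between_saves` should be close to zero; consistently positive values would match the behavior described here.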
To Reproduce
Steps to reproduce the behavior