Addition of a new symbol without complete retraining #52

Open
divyansh2111 opened this issue Nov 21, 2022 · 1 comment

Comments

@divyansh2111

Is it possible to add a few symbols to the charset and then fine-tune on a smaller dataset comprising these new symbols, starting from the pretrained weights?

@baudm
Owner

baudm commented Nov 22, 2022

It should be possible, but you won't be able to use the built-in finetuning code since the output shape will change. The process should look something like this:

  1. Append the additional symbols to the end of the charset (or you could just define a new one with the additional symbols). Make sure to update the charset inside test.py too if you're planning on using that. Also, do check the charset used during validation and update that if necessary.
  2. Manually load the weights for all layers except for the output head, whose shape depends on the charset size (possibly inside train.py just before the training loop).
  3. Use a low learning rate (something like 1e-4 to 1e-3).
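
A minimal sketch of steps 1 and 2 might look like the following. It is not code from this repository: `extend_charset`, `filter_compatible`, and `load_except_resized` are hypothetical helper names. It assumes `model` is a `torch.nn.Module` already built with the extended charset, and that the pretrained checkpoint has been read into a plain name-to-tensor mapping. Instead of naming the output head explicitly, it skips any tensor whose shape no longer matches, which would also catch other charset-sized layers if the model has them.

```python
from typing import Any, Dict, List, Mapping, Tuple


def extend_charset(charset: str, new_symbols: str) -> str:
    """Append unseen symbols to the end, so existing symbols keep their positions."""
    extra = []
    for ch in new_symbols:
        if ch not in charset and ch not in extra:
            extra.append(ch)
    return charset + "".join(extra)


def filter_compatible(
    pretrained: Mapping[str, Any], target: Mapping[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Split pretrained entries into those that still fit the new model and those that do not."""
    kept, skipped = {}, []
    for name, tensor in pretrained.items():
        if name in target and tuple(tensor.shape) == tuple(target[name].shape):
            kept[name] = tensor
        else:
            # Missing from the new model, or resized because the charset grew.
            skipped.append(name)
    return kept, skipped


def load_except_resized(model: Any, pretrained_state: Mapping[str, Any]) -> List[str]:
    """Load every tensor that still fits; skipped layers keep their fresh initialization."""
    kept, skipped = filter_compatible(pretrained_state, model.state_dict())
    # strict=False makes load_state_dict accept a partial state dict.
    model.load_state_dict(kept, strict=False)
    return skipped
```

Printing the returned list of skipped names is a quick check that only the expected charset-dependent layers were left at their random initialization before fine-tuning starts.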
