Different outputs when upgrading from adapter-transformers with LoRA #760
After diving into the codebase, I think I understand the difference. This looks like a bug in the current `adapters` LoRA implementation: the `alpha`-based scaling does not appear to be applied to the adapter output, so the effective scaling is always 1.0.
Am I missing something?
Hey @jblamare, thank you so much for reporting this and for providing the detailed investigation! I was able to reproduce the issue and believe your suspicion that this is a bug in our current implementation is correct. The application of the default scaling unfortunately got lost in a recent larger refactoring of the LoRA code. I think the proper way to re-add it would be in the LoRA module forward, like this: calpt@95be3cf. This removed the output diff between adapter-transformers & adapters for me; you might want to check on your side as well. I'll work on patching it in the main code. Thanks again for bringing this up!
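For readers following along, here is a minimal sketch of where the `alpha / r` scaling belongs in a LoRA forward pass. This is not the actual `adapters` code (the real fix is in calpt@95be3cf); the module and parameter names below are illustrative only.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Illustrative LoRA wrapper around a frozen linear layer (not the adapters implementation)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        # The scaling that got lost in the refactoring: alpha / r, not a constant 1.0.
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Low-rank update, scaled by alpha / r before being added to the base output.
        delta = x @ self.lora_A.T @ self.lora_B.T
        return self.base(x) + delta * self.scaling


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(16, 32), r=8, alpha=16)
    print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 32])
```

With `r=8, alpha=16` the adapter output is multiplied by 2.0, which is why ignoring the scaling produces a visible output diff against adapter-transformers.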
Hi @calpt, thanks a lot for looking into this! I can confirm that I've tested your change and it works for me. I'll look out for the next release!
Resolves issue described in #760. **IMPORTANT**: this fix restores weights compatibility with adapter-transformers. Compatibility with previous `adapters` versions is kept via a compat patch.

## Details

The current implementation of LoRA / (IA)^3 in `adapters` versions < 1.1.0 does not correctly implement adapter state scaling via the LoRA `alpha` attribute, effectively ignoring `alpha` and always applying a scaling of 1.0. This PR restores the correct original behavior as found in adapter-transformers' original LoRA implementation.

As this change breaks all adapters pre-trained using `adapters` versions 0.1.0 - 1.0.1, a backward compatibility patch is introduced that automatically sets `alpha = r` for LoRAs that were trained using affected versions. This ensures all previous adapters continue to behave exactly as trained (i.e. give the exact same output using newer versions).

---------

Co-authored-by: TimoImhof <[email protected]>
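For illustration, here is a rough sketch of the compat-patch idea described above. The function name, config fields, and loading hook are hypothetical, not the actual `adapters` internals; only the affected version range and the `alpha = r` rule come from the PR description.

```python
from packaging import version

# Adapters trained with versions 0.1.0 - 1.0.1 effectively used a scaling of 1.0,
# so pinning alpha = r keeps alpha / r == 1.0 and preserves their trained behavior.
AFFECTED_MIN = version.parse("0.1.0")
AFFECTED_MAX = version.parse("1.0.1")


def patch_lora_config(config: dict, trained_with: str) -> dict:
    """Hypothetical helper: pin alpha to r if the adapter was trained with an affected version."""
    v = version.parse(trained_with)
    if AFFECTED_MIN <= v <= AFFECTED_MAX and config.get("architecture") == "lora":
        config = dict(config)
        config["alpha"] = config["r"]  # alpha / r == 1.0 -> same outputs as before the fix
    return config


# Example: a LoRA trained with adapters 1.0.1 and r=8, alpha=16 gets alpha pinned to 8.
print(patch_lora_config({"architecture": "lora", "r": 8, "alpha": 16}, "1.0.1"))
```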
Environment info

- `adapters` version: 1.0.1
- `transformers` version: 4.45.2

Information

Model I am using (Bert, XLNet ...): google/flan-t5-small
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): LoRA
The problem arises when using:
The tasks I am working on is:
To reproduce

I have two environments: env1 with adapter-transformers and env2 with adapters.

I have some `input_ids` and a model checkpoint `checkpoint.pth`, which has the T5 weights plus a LoRA adapter and was saved in env1. From there I want to make sure I can load the model, run inference, and get the same outputs in env2. But the outputs are different. I run the following experiments:

Here is the code I use (in env1 I just remove `import adapters` and `adapters.init(model)`, and use `adapter_config = transformers.adapters.LoRAConfig(r=8, alpha=16)`):
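The original snippet was not preserved in this thread; the following is an illustrative reconstruction of the kind of script described. The checkpoint path is taken from the report, but the adapter name, prompt, and loading details are placeholders, not the reporter's actual code.

```python
import torch
import adapters  # removed in env1, together with adapters.init(model)
from transformers import AutoTokenizer, T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")
adapters.init(model)  # env2 only

# env2 config; in env1 this is transformers.adapters.LoRAConfig(r=8, alpha=16)
adapter_config = adapters.LoRAConfig(r=8, alpha=16)
model.add_adapter("my_lora", config=adapter_config)  # "my_lora" is a placeholder name
model.set_active_adapters("my_lora")

# checkpoint.pth holds the T5 weights plus the LoRA weights saved in env1.
state_dict = torch.load("checkpoint.pth", map_location="cpu")
model.load_state_dict(state_dict, strict=False)
model.eval()

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
input_ids = tokenizer(
    "translate English to German: Hello world", return_tensors="pt"
).input_ids  # placeholder prompt standing in for the reporter's input_ids

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```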
Unfortunately I can't share the model weights. Any thoughts on what might be the reason I get different outputs only if I use LoRA and load my weights?
Expected behavior
Getting the same output in env1 and env2.