Description
Environment info
- adapters version: 1.0.1
- transformers version: 4.45.2
- Platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.26.2
- Safetensors version: 0.4.5
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: No
- Using GPU in script?: Yes
- GPU type: NVIDIA RTX A6000
Information
Model I am using (Bert, XLNet ...): google/flan-t5-small
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): LoRA
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
I have two environments:
- env1:
  - adapter-transformers==3.1.0
  - torch==1.13.1
- env2:
  - adapters==1.0.1
  - torch==2.5.1
  - transformers==4.45.2
I have the following input_ids:
input_ids = [262, 4, 4815, 10, 8668, 3, 5359, 27415, 5332, 3430, 276, 3577, 20186, 11951, 8472, 11359, 4209, 205, 20931, 23936, 3388, 27447, 8015]
I have a model checkpoint, checkpoint.pth, which contains the T5 weights plus a LoRA adapter and was saved in env1:
with open("checkpoint.pth", "wb") as f:
    torch.save(model.state_dict(), f)
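As a sanity check on the checkpoint itself, a quick sketch like this can list which LoRA parameters it contains (only the file name is specific to my setup):

```python
import torch

# Sketch: list the LoRA-related parameter names stored in the checkpoint
checkpoint = torch.load("checkpoint.pth", map_location="cpu")
lora_keys = [k for k in checkpoint if "lora" in k.lower()]
print(len(lora_keys), "LoRA keys, e.g.:", lora_keys[:5])
```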
From there I want to make sure I can load the model, run inference, and get the same outputs in env2. But the outputs are different. I run the following experiments (comparing outputs as sketched after this list):
- Create a T5 model, add the empty LoRA adapter, run inference: env1 and env2 produce the same output.
- Create a T5 model, load the non-LoRA weights, run inference: env1 and env2 produce the same output.
- Create a T5 model, add the LoRA adapter, load all the weights, run inference: env1 produces the expected output but env2 differs.
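Concretely, I compare the encoder outputs roughly along these lines (the file names and tolerance are placeholders, not my exact script):

```python
import torch

# Sketch: dump the encoder output in each environment, then compare the two files.
def save_output(outputs, path):
    torch.save(outputs.last_hidden_state.detach().cpu(), path)

def compare(path1="env1_output.pt", path2="env2_output.pt"):
    out1, out2 = torch.load(path1), torch.load(path2)
    print("max abs diff:", (out1 - out2).abs().max().item())
    print("allclose:", torch.allclose(out1, out2, atol=1e-5))
```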
Here is the code I use (in env1 I just remove import adapters and adapters.init(model), and use adapter_config = transformers.adapters.LoRAConfig(r=8, alpha=16) instead):
import adapters
import torch
import transformers

input_ids = [262, 4, 4815, 10, 8668, 3, 5359, 27415, 5332, 3430, 276, 3577, 20186, 11951, 8472, 11359, 4209, 205, 20931, 23936, 3388, 27447, 8015]

model = transformers.AutoModel.from_pretrained("google/flan-t5-small")
adapters.init(model)

# Add and activate the LoRA adapter with the same config as in env1
adapter_config = adapters.LoRAConfig(r=8, alpha=16)
model.add_adapter("ct", config=adapter_config)
model.set_active_adapters("ct")

# Only the encoder is needed for inference
model = model.encoder

# Load the checkpoint saved in env1 (T5 weights + LoRA weights)
checkpoint = torch.load("checkpoint.pth", map_location=torch.device("cpu"))
model.load_state_dict(checkpoint, strict=False)

model = model.eval()
outputs = model(input_ids=torch.IntTensor([input_ids]))
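One thing I have not fully ruled out is whether load_state_dict(strict=False) silently skips some of the LoRA keys in env2 (the parameter names might differ between adapter-transformers 3.1.0 and adapters 1.0.1). A sketch of the check I have in mind:

```python
# Sketch: load_state_dict(strict=False) returns the keys it could not match,
# so LoRA weights that were silently dropped would show up here.
result = model.load_state_dict(checkpoint, strict=False)
print("missing LoRA keys:", [k for k in result.missing_keys if "lora" in k.lower()])
print("unexpected LoRA keys:", [k for k in result.unexpected_keys if "lora" in k.lower()])
```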
Unfortunately I can't share the model weights. Any thoughts on why I get different outputs only when I use LoRA and load my weights?
Expected behavior
Getting the same output in env1 and env2.