Stacktrace (Adapters-initialized matrices have a tensor dtype incompatible with the model)
Traceback (most recent call last):
File "/mnt/data/alex/lora_on_reft/adapters_issue.py", line 70, in<module>trainer.train()
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/trainer.py", line 2052, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/trainer.py", line 2388, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/trainer.py", line 3485, in training_step
loss = self.compute_loss(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/trainer.py", line 3532, in compute_loss
outputs = model(**inputs)
^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1189, in forward
outputs = self.model(
^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/context.py", line 116, in wrapper_func
results = f(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/model_mixin.py", line 1470, in forward
return super().forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1000, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1844, in _call_impl
return inner()
^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1790, in inner
result = forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/models/llama/modeling_llama.py", line 437, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/models/llama/modeling_llama.py", line 322, in forward
query_states = self.q_proj(hidden_states)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/methods/lora.py", line 521, in forward
state = self.compose(adapter_setup, state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/methods/adapter_layer_base.py", line 520, in compose
state = composition_func(adapter_setup, state, lvl=0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/methods/adapter_layer_base.py", line 354, in compose_stack
state = self.compose_single(adapter_stack_layer, state, lvl=lvl + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/methods/lora.py", line 503, in compose_single
hidden_states, gate = lora(state.hidden_states, state.layer_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/alex/miniconda3/envs/adapters_baseline/lib/python3.11/site-packages/adapters/methods/lora.py", line 95, in forward
hidden_states = self.lora_dropout(hidden_states) @ torch.t(self.lora_A) @ torch.t(self.lora_B)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float
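For context, the failure reduces to a plain dtype mismatch inside the LoRA forward: the hidden states arrive in bfloat16 while the freshly initialized LoRA matrices are float32. A minimal standalone illustration (the shapes are made up for the example):

```python
import torch

# The model runs in bfloat16, but the LoRA matrices are created in the
# default float32 dtype, so the matmul in adapters/methods/lora.py fails.
hidden_states = torch.randn(1, 8, 16, dtype=torch.bfloat16)
lora_A = torch.randn(4, 16)   # float32 by default
lora_B = torch.randn(16, 4)   # float32 by default

hidden_states @ torch.t(lora_A) @ torch.t(lora_B)
# RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float
```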
Expected behavior
Adapters should automatically adjust the adapter dtype to the model's dtype, or offer an option to configure it manually in the AdapterConfig; such an option does not currently seem to exist.
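Until such an option exists, a possible stopgap is to cast the adapter parameters to the model's dtype by hand after adding the adapter. This is only a sketch; the `lora_` name filter is an assumption based on the parameter names in the traceback and would need adjusting for ReFT or other adapter methods:

```python
import torch

def cast_adapter_params(model, dtype=None):
    """Cast adapter parameters (e.g. lora_A / lora_B) to the model's dtype."""
    dtype = dtype or next(model.parameters()).dtype  # e.g. torch.bfloat16
    for name, param in model.named_parameters():
        # Assumed name filter: LoRA matrices appear as ...lora_A / ...lora_B.
        if "lora_" in name:
            param.data = param.data.to(dtype)

# cast_adapter_params(model)  # call after model.add_adapter(...) / model.train_adapter(...)
```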
Environment info
adapters version: 1.0.1
Information
Model I am using (Bert, XLNet ...): Llama 3.2 1B (English)
Adapter setup I am using (if any): LoRA/ReFT
The task I am working on is: fine-tuning on BoolQ
To reproduce
(note: the same error results if the LoRA setup is swapped for ReFT)
Running the training script, sketched below, produces the stacktrace shown above (Adapters-initialized matrices have a tensor dtype incompatible with the model).
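A minimal sketch of the setup that triggers the error (the original script is not reproduced here; the model id, adapter name, and training setup below are assumptions):

```python
import torch
import adapters
from adapters import LoRAConfig
from transformers import AutoModelForCausalLM

# Load the base model in bfloat16; the adapter weights added afterwards
# are initialized in float32, which later triggers the dtype mismatch.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16
)
adapters.init(model)
model.add_adapter("boolq_lora", config=LoRAConfig())
model.train_adapter("boolq_lora")

# Build a BoolQ dataset and a transformers Trainer as usual, then:
# trainer.train()  # raises: RuntimeError: expected mat1 and mat2 to have the same dtype
```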