adapters-1.0.0 tokenizers-0.19.1 transformers-4.43.4

BUG:
Run this code:
modules = {}
for n, m in model.named_modules():
    for np, p in m.named_parameters(recurse=False):
        if p is None:
            continue
        key = n + '.' + np
        if key in modules:
            # If the same key shows up again, make sure it refers to the same parameter object.
            assert id(p) == id(modules[key][0]), (n, np, p.shape, modules[key][0].shape)
            continue
        modules[key] = (p, m)

n_params = len(list(model.named_parameters()))  # named_parameters() does not return all parameters here
assert len(modules) == n_params, (len(modules), n_params)
PRINT:
134 != 133
So there is a layer of parameters that is not registered.
It is heads.default.3.weight.
And I found that its bias prints normally.
I believe this behavior happens due to weight sharing between the input embedding layer and the output projection of the default head (heads.default.3). By default, shared weights are only yielded once by named_parameters(), so when iterating over the full model's parameters you only get the input embedding weight and not the (identical) output projection in the head. You can call named_parameters(remove_duplicate=False) to also yield shared weights, which should then also return heads.default.3. See here.
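Here is a minimal, self-contained sketch (not the actual adapters model; TiedLM, embed, and out are made-up names) showing how tied weights are deduplicated by named_parameters() and how remove_duplicate=False makes the shared weight appear under both names:

import torch.nn as nn

# Hypothetical toy model with tied input/output embeddings, mirroring the
# weight sharing between the embedding layer and heads.default.3 described above.
class TiedLM(nn.Module):
    def __init__(self, vocab_size=10, hidden=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.out = nn.Linear(hidden, vocab_size, bias=True)
        self.out.weight = self.embed.weight  # tie the two weight tensors

model = TiedLM()

# Default behavior: a shared parameter is yielded only once,
# so out.weight is missing while out.bias is still listed.
print([n for n, _ in model.named_parameters()])
# ['embed.weight', 'out.bias']

# With remove_duplicate=False the tied weight is reported under both names.
print([n for n, _ in model.named_parameters(remove_duplicate=False)])
# ['embed.weight', 'out.weight', 'out.bias']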