AttributeError: module 'torch.nn.functional' has no attribute 'one_hot' #8
Comments
It looks like your PyTorch is out of date. Can you update it and try again?
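For reference, F.one_hot was only added in PyTorch 1.1.0, which is why older installs raise this AttributeError. A quick way to check (a minimal sketch):

import torch
print(torch.__version__)  # F.one_hot requires PyTorch >= 1.1.0

If the printed version is below 1.1.0, pip install --upgrade torch (or the conda equivalent) should resolve it.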
Hi Thilina,
Unfortunately, with no GPU your training speed will be slow. I can't remember the total number of steps, but it should be in the output right before training starts; checkpoint-8000 means that 8000 steps have been completed. There should also be a tqdm progress bar with the approximate time remaining to completion. I can't answer the freezing-layers question off the top of my head, but I can get back to you in a few hours. Usually, though, fine-tuning transformer models is done without freezing any of the layers. I think it would be best to use Google Colab with a GPU rather than running locally if a GPU is not available.
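For context on where the step total comes from: in run_glue-style training loops like this one, it is usually derived from the dataloader length. A sketch with assumed names (the exact variables in this repo may differ):

# Assumed names, for illustration: how total optimization steps are typically computed
t_total = len(train_dataloader) // args['gradient_accumulation_steps'] * args['num_train_epochs']

That is consistent with the log below: 560000 examples / batch size 8 * 1 epoch = 70000 steps.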
If that's correct, it has only completed 8000 of the 70000 steps you set as t_total = 70000.
For most transfer learning tasks, you would usually freeze the earlier layers. But in the case of BERT and other derivatives, the approach is to fine-tune all parameters, albeit for only a few epochs. This was the same approach used in the BERT paper.
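For contrast, freezing the encoder and training only the classification head would look roughly like this (a sketch; it assumes pytorch_transformers' XLNetForSequenceClassification, which exposes the base encoder as model.transformer):

# Sketch: freeze the XLNet encoder so only the classification head trains.
# Assumes XLNetForSequenceClassification, where `model.transformer` is the encoder.
for param in model.transformer.parameters():
    param.requires_grad = False

In practice, as noted above, fine-tuning all parameters for a few epochs tends to work better for BERT-family models.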
Hi, I downloaded and ran your program and got the training error above. I have no GPU, so I changed the setup to fp16 = 'false' (xlnet left as your demo choice).
What's the problem?
DarrellWong
code:
if args['do_train']:
    train_dataset = load_and_cache_examples(task, tokenizer)
    global_step, tr_loss = train(train_dataset, model, tokenizer)
    logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)
------------------------------ output window ------------------------------
INFO:__main__:Creating features from dataset file at data/
100%|████████████████████████████████| 560000/560000 [05:07<00:00, 1823.33it/s]
INFO:__main__:Saving features into cached file data/cached_train_xlnet-base-cased_128_binary
INFO:__main__:***** Running training *****
INFO:__main__:  Num examples = 560000
INFO:__main__:  Num Epochs = 1
INFO:__main__:  Total train batch size = 8
INFO:__main__:  Gradient Accumulation steps = 1
INFO:__main__:  Total optimization steps = 70000
Epoch:   0%|          | 0/1 [00:00<?, ?it/s]
HBox(children=(IntProgress(value=0, description='Iteration', max=70000, style=ProgressStyle(description_width=…
------------------------------ error messages ------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 if args['do_train']:
      2     train_dataset = load_and_cache_examples(task, tokenizer)
----> 3     global_step, tr_loss = train(train_dataset, model, tokenizer)
      4     logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

<ipython-input-...> in train(train_dataset, model, tokenizer)
     43               'token_type_ids': batch[2] if args['model_type'] in ['bert', 'xlnet'] else None,  # XLM don't use segment_ids
     44               'labels': batch[3]}
---> 45     outputs = model(**inputs)
     46     loss = outputs[0]  # model outputs are always tuple in pytorch-transformers (see doc)
     47     print("\r%f" % loss, end='')
~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)
~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\pytorch_transformers\modeling_xlnet.py in forward(self, input_ids, token_type_ids, input_mask, attention_mask, mems, perm_mask, target_mapping, labels, head_mask)
   1120         input_mask=input_mask, attention_mask=attention_mask,
   1121         mems=mems, perm_mask=perm_mask, target_mapping=target_mapping,
--> 1122        head_mask=head_mask)
   1123         output = transformer_outputs[0]
   1124
~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)
~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\pytorch_transformers\modeling_xlnet.py in forward(self, input_ids, token_type_ids, input_mask, attention_mask, mems, perm_mask, target_mapping, head_mask)
    920                 # `1` indicates not in the same segment [qlen x klen x bsz]
    921                 seg_mat = (token_type_ids[:, None] != cat_ids[None, :]).long()
--> 922                 seg_mat = F.one_hot(seg_mat, num_classes=2).to(dtype_float)
    923             else:
    924                 seg_mat = None
AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'
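The failing call is F.one_hot, which only exists in PyTorch 1.1.0 and later, so upgrading PyTorch is the fix. If upgrading were truly impossible, a shim could be monkey-patched in before importing pytorch_transformers (a hedged sketch, not part of either library):

import torch
import torch.nn.functional as F

# Hypothetical shim: backfill F.one_hot on PyTorch < 1.1.0 using scatter_.
# Upgrading PyTorch is the proper fix; this only mirrors one_hot's semantics.
if not hasattr(F, 'one_hot'):
    def one_hot(tensor, num_classes):
        out = torch.zeros(*tensor.shape, num_classes, dtype=torch.long, device=tensor.device)
        return out.scatter_(-1, tensor.unsqueeze(-1), 1)
    F.one_hot = one_hot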