AttributeError: module 'torch.nn.functional' has no attribute 'one_hot' #8

wongdarrell opened this issue Sep 15, 2019 · 5 comments

@wongdarrell

Hi, I downloaded and ran your program and got the training error in the title. I have no GPU, so I changed the setup to fp16 = 'false' (xlnet left as in your demo).

What's the problem?

DarrellWong
code:
if args['do_train']:
    train_dataset = load_and_cache_examples(task, tokenizer)
    global_step, tr_loss = train(train_dataset, model, tokenizer)
    logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)
------------------------------------------------------------ output window----
INFO:__main__:Creating features from dataset file at data/
100%|████████████████████████████████| 560000/560000 [05:07<00:00, 1823.33it/s]
INFO:__main__:Saving features into cached file data/cached_train_xlnet-base-cased_128_binary
INFO:__main__:***** Running training *****
INFO:__main__: Num examples = 560000
INFO:__main__: Num Epochs = 1
INFO:__main__: Total train batch size = 8
INFO:__main__: Gradient Accumulation steps = 1
INFO:__main__: Total optimization steps = 70000
Epoch: 0%| | 0/1 [00:00<?, ?it/s]

HBox(children=(IntProgress(value=0, description='Iteration', max=70000, style=ProgressStyle(description_width=…
-----------------and then error messages --------------------

AttributeError Traceback (most recent call last)
in
1 if args['do_train']:
2 train_dataset = load_and_cache_examples(task, tokenizer)
----> 3 global_step, tr_loss = train(train_dataset, model, tokenizer)
4 logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

in train(train_dataset, model, tokenizer)
43 'token_type_ids': batch[2] if args['model_type'] in ['bert', 'xlnet'] else None, # XLM don't use segment_ids
44 'labels': batch[3]}
---> 45 outputs = model(**inputs)
46 loss = outputs[0] # model outputs are always tuple in pytorch-transformers (see doc)
47 print("\r%f" % loss, end='')

~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\pytorch_transformers\modeling_xlnet.py in forward(self, input_ids, token_type_ids, input_mask, attention_mask, mems, perm_mask, target_mapping, labels, head_mask)
1120 input_mask=input_mask, attention_mask=attention_mask,
1121 mems=mems, perm_mask=perm_mask, target_mapping=target_mapping,
-> 1122 head_mask=head_mask)
1123 output = transformer_outputs[0]
1124

~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~\AppData\Local\Continuum\anaconda3\envs\transformers\lib\site-packages\pytorch_transformers\modeling_xlnet.py in forward(self, input_ids, token_type_ids, input_mask, attention_mask, mems, perm_mask, target_mapping, head_mask)
920 # 1 indicates not in the same segment [qlen x klen x bsz]
921 seg_mat = (token_type_ids[:, None] != cat_ids[None, :]).long()
--> 922 seg_mat = F.one_hot(seg_mat, num_classes=2).to(dtype_float)
923 else:
924 seg_mat = None

AttributeError: module 'torch.nn.functional' has no attribute 'one_hot'

@ThilinaRajapakse
Owner

It looks like your Pytorch is out of date. Can you update it and try again?
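
For what it's worth, F.one_hot was only added in a later PyTorch release (around 1.1), which is why older installs throw that AttributeError. Something along these lines should confirm whether the upgrade took (the upgrade commands in the comment are just the usual conda/pip ones, so adjust them to your environment):

import torch
import torch.nn.functional as F

# F.one_hot is only available in newer PyTorch releases (around 1.1 onwards),
# so older installs raise the AttributeError shown in the traceback.
print(torch.__version__)

if hasattr(F, 'one_hot'):
    # the same kind of call modeling_xlnet.py makes when building the segment matrix
    seg_mat = torch.tensor([[0, 1], [1, 0]])
    print(F.one_hot(seg_mat, num_classes=2).float())
else:
    print("PyTorch is too old; upgrade with 'conda update pytorch -c pytorch' "
          "or 'pip install -U torch' and restart the kernel")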

@wongdarrell
Author

Hi Thilina,
Your suggestion worked. However, it's now been about 32 hours of processing, and it is up to:
INFO:__main__:Saving model checkpoint to outputs/checkpoint-8000
with no sign of stopping. What is the last checkpoint count in your default run (1 epoch)?
Also, what's the setting if I want to freeze all weights except the last layer?
Thanks

@ThilinaRajapakse
Owner

Unfortunately, with no GPU your training speed will be slow. I can't remember the total number of steps, but it should be there in the output right before training starts. Checkpoint-8000 means that 8000 steps have been completed. There should also be a tqdm progress bar with approximate time remaining to completion.
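
For reference, the total step count is just arithmetic on the numbers printed in the training log; a rough sketch (variable names here are illustrative, not necessarily the notebook's):

# values taken from the "Running training" log above
num_examples = 560000
train_batch_size = 8
gradient_accumulation_steps = 1
num_train_epochs = 1

# one optimization step per (batch size x accumulation steps) examples
steps_per_epoch = num_examples // (train_batch_size * gradient_accumulation_steps)
t_total = steps_per_epoch * num_train_epochs
print(t_total)  # 70000, matching "Total optimization steps" in the log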

I can't remember off the top of my head, but I can get back to you on freezing layers in a few hours. Usually, though, fine-tuning transformer models is done without freezing any of the layers.

I think it would be best if you used Google Colab with GPU rather than running it locally if a GPU is not available.

@wongdarrell
Author

If that's correct, it has only reached 8000 of the 70000 steps you embedded as t_total = 70000.
Regarding fine-tuning, I thought that several transfer-learning examples freeze all weight layers except the last one, which is usually specific to the new problem.
It seems that I will need to switch to Colab then.

@ThilinaRajapakse
Owner

For most transfer learning tasks, you would usually freeze the earlier layers. But in the case of BERT and other derivatives, the approach is to fine-tune all parameters, albeit for only a few epochs. This was the same approach used in the BERT paper.

For each task, we simply plug in the task specific inputs and outputs into BERT and finetune all the parameters end-to-end.
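
That said, if you do want to experiment with freezing, a minimal sketch along these lines should work, assuming the model variable from the notebook (the 'transformer.' prefix follows pytorch-transformers' XLNetForSequenceClassification parameter naming, so verify it against your model before relying on it):

# Freeze the XLNet backbone and leave the task-specific head trainable.
# Assumption: backbone parameters are named with a 'transformer.' prefix,
# as in pytorch-transformers' XLNetForSequenceClassification.
for name, param in model.named_parameters():
    if name.startswith('transformer.'):
        param.requires_grad = False

# build the optimizer from the trainable parameters only
trainable_params = [p for p in model.parameters() if p.requires_grad]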
