RuntimeError: CUDA error: device-side assert triggered #8

PopMeshgrid · 2019-10-23T08:08:16Z

/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
File "train.py", line 364, in <module>
decoder_input_init,decoder_hidden_init,attention_sum_init,decoder_attention_init)
File "train.py", line 212, in my_train
if int(y[0][i][di]) == 0:
RuntimeError: CUDA error: device-side assert triggered

When I train on my own dataset, I get the error above. How can I solve it?

PopMeshgrid · 2019-10-23T11:02:21Z

/opt/conda/conda-bld/pytorch_1550802451070/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1550802451070/work/aten/src/THCUNN/generic/ClassNLLCriterion.cu line=111 error=59 : device-side assert triggered
Traceback (most recent call last):
File "Train.py", line 360, in <module>
decoder_input_init,decoder_hidden_init,attention_sum_init,decoder_attention_init)
File "Train.py", line 214, in my_train
loss += criterion(decoder_output[i], y[:,i,di])
File "/gpu/zhengtianxiang/soft/Anaconda1/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/gpu/zhengtianxiang/soft/Anaconda1/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 210, in forward
return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
File "/gpu/zhengtianxiang/soft/Anaconda1/envs/ocr/lib/python3.6/site-packages/torch/nn/functional.py", line 1790, in nll_loss
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1550802451070/work/aten/src/THCUNN/generic/ClassNLLCriterion.cu:111

The traceback above gives more detail.

Jeremy-lf · 2019-10-24T02:41:17Z

It's just that Voc_size is not the same as the actual vocabulary size of your dataset.
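The assertion in the log, t >= 0 && t < n_classes, fails when a target index falls outside the range of classes the model outputs. A minimal sketch of a check you could run on the labels before the loss is computed is shown here. The function name and the example values are hypothetical, and it works on a plain list of integers (for a tensor, something like `y.flatten().tolist()` would give one). The default `ignore_index=-100` mirrors the default of `nll_loss`.

```python
def find_out_of_range_targets(targets, n_classes, ignore_index=-100):
    """Return (position, value) pairs for targets outside [0, n_classes)."""
    bad = []
    for pos, t in enumerate(targets):
        if t == ignore_index:
            # nll_loss skips this value, so it is not an error
            continue
        if not 0 <= t < n_classes:
            bad.append((pos, t))
    return bad


# Hypothetical example: the labels contain index 111,
# but the model only has 111 outputs (valid indices 0..110).
targets = [0, 5, 111, 42]
print(find_out_of_range_targets(targets, n_classes=111))  # [(2, 111)]
```

If this returns a non-empty list, the vocabulary size passed to the model is probably smaller than what the labels need.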

ZacHu-ZYH · 2019-12-17T04:53:10Z

@PopMeshgrid Hi, I ran into the same problem. Have you solved it? Thanks!

whywhs · 2019-12-17T07:18:34Z

I suggest you check the number of label classes in your dataset.
The CROHME dataset has 110 symbols plus 'eol' and 'sos', so my dataset has 112 label classes. You should change the parameter in Train.py (line 275) so that it matches the number of label classes in your own dataset.
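To find the right value for a custom dataset, one option is to count the distinct symbols in the label files and add the two special tokens. The sketch below assumes each label is a list of symbol strings. The `build_vocab` helper, the token order, and the sample labels are illustrative assumptions, not code from this repository.

```python
def build_vocab(label_sequences, special_tokens=("sos", "eol")):
    """Map every token to an index: special tokens first, then sorted symbols."""
    symbols = sorted({s for seq in label_sequences for s in seq})
    vocab = {}
    for tok in list(special_tokens) + symbols:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


# Hypothetical labels with 5 distinct symbols: + 2 ^ x y
labels = [["x", "+", "y"], ["x", "^", "2"]]
vocab = build_vocab(labels)

n_classes = len(vocab)
print(n_classes)            # 7 (5 symbols + 'sos' + 'eol')
print(max(vocab.values()))  # 6, the largest valid target index
```

With CROHME's 110 symbols, the same count gives 110 + 2 = 112, which is the value the model's vocabulary size parameter has to match.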
