RuntimeError: CUDA error: device-side assert triggered #223

Open
ycc66104116 opened this issue Apr 24, 2022 · 4 comments

Comments

@ycc66104116

Hi, I recently trained DeepLab V3+ on my own dataset and got the error below.
I think it is related to the number of classes, but I am sure that my dataset has 6 classes + 1 (background), and I have changed the number in utils.py and .py. I really don't know how to fix it. Does anyone know what is wrong? I would really appreciate any help with this problem.

BTW, I previously ran it successfully with 2 classes, but when I switched to another dataset and changed the number to 7, it raised this error.
Also, .py is modified from pascal.py: I only changed the class names, the number of classes, and the base directory to match my dataset. Everything else is the same as in pascal.py.
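A quick way to test the "it is about the classes" suspicion is to look at the values actually stored in the label images, not only at the configured class count. The sketch below is not code from the repository: the helper name `out_of_range_labels` is made up for illustration, and `ignore_index=255` is an assumption (it is a common choice for segmentation losses, but check the value your loss is constructed with). It lists every label value that would violate `t >= 0 && t < n_classes`.

```python
import numpy as np


def out_of_range_labels(mask, num_classes, ignore_index=255):
    """Return the label values in `mask` that the loss would reject.

    A value is accepted if it lies in [0, num_classes) or equals
    `ignore_index` (assumed to be 255 here; adjust to your loss).
    """
    values = np.unique(np.asarray(mask))
    valid = (values >= 0) & (values < num_classes)
    valid |= values == ignore_index
    return values[~valid].tolist()


# Toy 2x3 label image: 7 is out of range for 7 classes (valid: 0..6),
# while 255 is tolerated because it is treated as the ignore index.
mask = np.array([[0, 1, 6], [7, 255, 3]], dtype=np.uint8)
print(out_of_range_labels(mask, num_classes=7))  # [7]
```

Running this over every label file in the dataset (loaded however your dataset class loads them) should show whether some masks store colors or gray levels instead of class indices.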

--------------my error message-----------
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [708,0,0] Assertion t >= 0 && t < n_classes failed.
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [709,0,0] Assertion t >= 0 && t < n_classes failed.
C:\w\b\windows\pytorch\aten\src\ATen\native\cuda\NLLLoss2d.cu:95: block: [0,0,0], thread: [710,0,0] Assertion t >= 0 && t < n_classes failed.
...
Traceback (most recent call last):
  File "train.py", line 388, in <module>
    main()
  File "train.py", line 374, in main
    trainer.training(epoch)
  File "train.py", line 134, in training
    loss = self.criterion(output, target)
  File "N:\pytorch-deeplab-xception-master\utils\loss.py", line 28, in CrossEntropyLoss
    loss = criterion(logit, target.long())
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\modules\loss.py", line 1152, in forward
    label_smoothing=self.label_smoothing)
  File "C:\Users\LOC\anaconda3\envs\envfordeeplab1229\lib\site-packages\torch\nn\functional.py", line 2846, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: CUDA error: device-side assert triggered
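The traceback above ends in the loss call, which matches the `NLLLoss2d.cu` assertion, but CUDA kernels are launched asynchronously, so the Python frame reported for a device-side assert is not always the one that caused it. A minimal sketch of the usual debugging step, assuming it is placed at the very top of the training script before `torch` is imported (the placement is my assumption, not something stated in this thread):

```python
import os

# Make CUDA kernel launches synchronous so the Python traceback points at
# the operation that actually failed. This has to be set before CUDA is
# initialized; the simplest way is to set it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])  # 1
```

Running a single batch on the CPU instead of the GPU is another option and typically produces a more readable out-of-bounds message. Both slow training down, so they are only meant for debugging.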

@xieyufei-SLAM

I have run into the same problem as you. Have you solved it yet? A month ago I could run the code with no problem, but this time, after cloning the repository again, it reported this error, which is strange.

@ycc66104116
Author

Yes, I can run the code now, but I haven't tried multiple classes yet. My data currently contains only one kind of target plus background.
I converted my label data to class indices, so the label images contain only the two values 0 and 1 (since there are only 2 classes now).
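For anyone wanting to do the same conversion, here is a rough sketch of turning raw mask values into class indices. It is only an illustration: `remap_to_indices` and the `{0: 0, 255: 1}` mapping are hypothetical and assume a single-channel mask whose foreground was saved as 255. For 7 classes the mapping would need one entry per raw value, ending at index 6.

```python
import numpy as np


def remap_to_indices(mask, value_to_class):
    """Map raw pixel values in a single-channel mask to class indices.

    Raises ValueError if the mask contains a value missing from the
    mapping, so stray pixel values are caught before training.
    """
    mask = np.asarray(mask)
    unknown = set(np.unique(mask).tolist()) - set(value_to_class)
    if unknown:
        raise ValueError(f"unmapped pixel values: {sorted(unknown)}")
    out = np.zeros(mask.shape, dtype=np.int64)
    for value, cls in value_to_class.items():
        out[mask == value] = cls
    return out


# Toy mask where background is 0 and the target was saved as 255.
raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)
indexed = remap_to_indices(raw, {0: 0, 255: 1})
print(indexed.tolist())  # [[0, 1], [1, 0]]
```

Failing loudly on unmapped values is deliberate: anti-aliased edges or lossy image formats can introduce intermediate gray levels that would otherwise end up as wrong labels.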

@chiba1sonny

> I have run into the same problem as you. Have you solved it yet? A month ago I could run the code with no problem, but this time, after cloning the repository again, it reported this error, which is strange.

Hey, did you fix the problem?

@123Bruceche

> I have run into the same problem as you. Have you solved it yet? A month ago I could run the code with no problem, but this time, after cloning the repository again, it reported this error, which is strange.

Hello, did you fix the problem?
