Description
When I set `activate='relu'` in CSPDarknet53.py, line 35, I hit the following RuntimeError.
Environment: NVIDIA A100, CUDA 11.4, PyTorch 1.10.1. (The same code runs fine on another server with different versions.)
```
Traceback (most recent call last):
  File "train.py", line 308, in <module>
    Trainer(
  File "train.py", line 196, in train
    loss.backward()
  File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12, 512, 13, 13]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
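As the error hint suggests, anomaly detection can pinpoint which forward-pass op produced the tensor that was later modified in place. A minimal self-contained sketch (toy tensors, not the actual training code) of how that mode is used:

```python
import torch

# With anomaly detection on, the RuntimeError's traceback also points at the
# forward operation (here, relu) whose saved output was mutated in place.
with torch.autograd.set_detect_anomaly(True):
    x = torch.randn(4, requires_grad=True)
    y = torch.relu(x)   # ReluBackward0 saves `y` to compute its gradient
    y += 1.0            # in-place add bumps `y`'s version counter
    try:
        y.sum().backward()
    except RuntimeError as e:
        print(type(e).__name__, e)
```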
Finally, I solved the problem by changing `out += residual` to `out = out + residual` in CSPDarknet53.py, line 108.
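Why the fix works: `+=` mutates the very tensor that ReLU's backward saved (ReLU computes its gradient from its own output), so autograd's version check fails at `loss.backward()`, whereas `out = out + residual` allocates a new tensor and leaves the saved one untouched. A minimal sketch with toy shapes (not the actual CSPDarknet53 module):

```python
import torch

# In-place add: bumps the version counter of ReLU's saved output,
# so backward raises the same RuntimeError as in the issue.
x = torch.randn(4, requires_grad=True)
out = torch.relu(x)
out += torch.ones(4)
try:
    out.sum().backward()
except RuntimeError as e:
    print("in-place add fails:", e)

# Out-of-place add: a fresh tensor is created; backward succeeds.
x2 = torch.randn(4, requires_grad=True)
out2 = torch.relu(x2)
out2 = out2 + torch.ones(4)
out2.sum().backward()
print("out-of-place grad:", x2.grad)
```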