
RuntimeError when setting activate='relu' (Solved) #247

Open
@liuwang0713

Description

When I set activate='relu' in CSPDarknet53.py, line 35, I got the following RuntimeError.
(NVIDIA A100, CUDA 11.4, PyTorch 1.10.1; the same code runs fine on another server with different library versions.)

Traceback (most recent call last):
  File "train.py", line 308, in <module>
    Trainer(
  File "train.py", line 196, in train
    loss.backward()
  File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12, 512, 13, 13]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
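
As the hint in the error message suggests, wrapping the forward and backward pass in torch.autograd.set_detect_anomaly(True) makes PyTorch print a second traceback that points at the forward operation whose gradient later failed. A minimal standalone sketch of that workflow (the toy tensor below is illustrative, not the repo's code):

    import torch

    x = torch.randn(4, requires_grad=True)

    with torch.autograd.set_detect_anomaly(True):
        out = torch.relu(x)
        out += 1.0            # in-place edit of a tensor autograd saved
        out.sum().backward()  # raises the same RuntimeError, plus an
                              # extra traceback naming the relu call above

With anomaly detection enabled, the extra traceback points at the relu call, identifying the operation whose saved output was modified in place.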

Finally, I solved the problem by changing out += residual to out = out + residual in CSPDarknet53.py, line 108. The in-place += mutates the ReLU output that autograd saved for the backward pass, bumping its version counter from 1 to 2; the out-of-place addition writes to a new tensor and leaves the saved output untouched.
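
Here is a minimal sketch of the pattern (illustrative only, assuming a residual block similar to the one in CSPDarknet53.py; the layer names are hypothetical):

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # Illustrative residual block, not the actual CSPDarknet53 code.
        def __init__(self, channels):
            super().__init__()
            self.conv = nn.Conv2d(channels, channels, 3, padding=1)
            self.relu = nn.ReLU(inplace=True)  # activate='relu'

        def forward(self, x):
            residual = x
            out = self.relu(self.conv(x))
            # BUG: out += residual mutates the ReLU output in place.
            # ReluBackward0 needs that output unmodified to compute its
            # gradient, so loss.backward() raises the version-counter
            # error above. The out-of-place add allocates a new tensor:
            out = out + residual
            return out

    x = torch.randn(12, 512, 13, 13, requires_grad=True)
    y = ResidualBlock(512)(x)
    y.sum().backward()  # succeeds with the out-of-place add

ReLU's backward depends on its own output (the gradient is zeroed where the output is zero), so autograd saves that tensor and checks its version counter at backward time; activations whose backward saves the input instead would not hit this check, which may be why other activate settings did not trip it.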
