RuntimeError: self._train_loop should not be None when calling train method. Please provide train_dataloader, train_cfg, optimizer and param_scheduler arguments when initializing runner. #11697

Open
wdzwdxy opened this issue May 9, 2024 · 5 comments

wdzwdxy commented May 9, 2024

When I run a config file with train.py, I get the following error:
RuntimeError: self._train_loop should not be None when calling train method. Please provide train_dataloader, train_cfg, optimizer and param_scheduler arguments when initializing runner.

Full error output:

System environment:
sys.platform: win32
Python: 3.10.14 | packaged by Anaconda, Inc. | (main, Mar 21 2024, 16:20:14) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
MUSA available: False
numpy_random_seed: 895890165
GPU 0: NVIDIA GeForce RTX 3050 Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.0
NVCC: Cuda compilation tools, release 12.0, V12.0.76
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64
GCC: n/a
PyTorch: 2.2.2+cu121
PyTorch compiling details: PyTorch built with:

  • C++ Version: 201703

  • MSVC 192930151

  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications

  • Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)

  • OpenMP 2019

  • LAPACK is enabled (usually provided by MKL)

  • CPU capability usage: AVX2

  • CUDA Runtime 12.1

  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90

  • CuDNN 8.8.1 (built against CUDA 12.0)

  • Magma 2.5.4

  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.8.1, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

    TorchVision: 0.17.2+cu121
    OpenCV: 4.9.0
    MMEngine: 0.10.3

Runtime environment:
dist_cfg: {'backend': 'nccl'}
seed: 895890165
Distributed launcher: none
Distributed training: False
GPU number: 1

05/09 13:46:54 - mmengine - INFO - Config:
checkpoint_config = dict(interval=1)
classes = ('CultivatedLand', )
custom_hooks = [
dict(type='NumClassCheckHook'),
]
data = dict(
samples_per_gpu=1,
test=dict(
ann_file='out_shp/train/512-512/annotations/test.json',
classes=('CultivatedLand', ),
img_prefix='out_shp/train/512-512/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
flip=False,
img_scale=(
800,
800,
),
transforms=[
dict(keep_ratio=True, type='Resize'),
dict(type='RandomFlip'),
dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True,
type='Normalize'),
dict(size_divisor=32, type='Pad'),
dict(keys=[
'img',
], type='ImageToTensor'),
dict(keys=[
'img',
], type='Collect'),
],
type='MultiScaleFlipAug'),
],
type='CocoDataset'),
train=dict(
ann_file='out_shp/train/512-512/annotations/train.json',
classes=('CultivatedLand', ),
img_prefix='out_shp/train/512-512/train/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(img_scale=(
600,
600,
), keep_ratio=True, type='Resize'),
dict(flip_ratio=0.5, type='RandomFlip'),
dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True,
type='Normalize'),
dict(size_divisor=32, type='Pad'),
dict(type='DefaultFormatBundle'),
dict(
keys=[
'img',
'gt_bboxes',
'gt_labels',
'gt_masks',
],
type='Collect'),
],
type='CocoDataset'),
val=dict(
ann_file='out_shp/train/512-512/annotations/test.json',
classes=('CultivatedLand', ),
img_prefix='out_shp/train/512-512/test/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
flip=False,
img_scale=(
600,
600,
),
transforms=[
dict(keep_ratio=True, type='Resize'),
dict(type='RandomFlip'),
dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True,
type='Normalize'),
dict(size_divisor=32, type='Pad'),
dict(keys=[
'img',
], type='ImageToTensor'),
dict(keys=[
'img',
], type='Collect'),
],
type='MultiScaleFlipAug'),
],
type='CocoDataset'),
workers_per_gpu=1)
data_root = 'out_shp/train/512-512'
data_test = 'out_shp/train/512-512/'
dataset_type = 'CocoDataset'
default_scope = 'mmdet'
dist_params = dict(backend='nccl')
evaluation = dict(metric=[
'bbox',
'segm',
])
filter_empty_gt = False
gpu_ids = range(0, 8)
img_norm_cfg = dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True)
launcher = 'none'
load_from = 'out_shp/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312-946fd751.pth'
log_config = dict(
hooks=[
dict(type='TextLoggerHook'),
], interval=50)
log_level = 'INFO'
lr_config = dict(
policy='step',
step=[
8,
11,
],
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001)
model = dict(
backbone=dict(
base_width=4,
dcn=dict(deform_groups=1, fallback_on_stride=False, type='DCN'),
depth=101,
frozen_stages=1,
groups=64,
init_cfg=dict(
checkpoint='open-mmlab://resnext101_64x4d', type='Pretrained'),
norm_cfg=dict(requires_grad=True, type='BN'),
norm_eval=True,
num_stages=4,
out_indices=(
0,
1,
2,
3,
),
stage_with_dcn=(
False,
True,
True,
True,
),
style='pytorch',
type='ResNeXt'),
neck=dict(
in_channels=[
256,
512,
1024,
2048,
],
num_outs=5,
out_channels=256,
type='FPN'),
roi_head=dict(
bbox_head=[
dict(
bbox_coder=dict(
target_means=[
0.0,
0.0,
0.0,
0.0,
],
target_stds=[
0.1,
0.1,
0.2,
0.2,
],
type='DeltaXYWHBBoxCoder'),
fc_out_channels=1024,
in_channels=256,
loss_bbox=dict(beta=1.0, loss_weight=1.0, type='SmoothL1Loss'),
loss_cls=dict(
loss_weight=1.0,
type='CrossEntropyLoss',
use_sigmoid=False),
num_classes=1,
reg_class_agnostic=True,
roi_feat_size=7,
type='Shared2FCBBoxHead'),
dict(
bbox_coder=dict(
target_means=[
0.0,
0.0,
0.0,
0.0,
],
target_stds=[
0.05,
0.05,
0.1,
0.1,
],
type='DeltaXYWHBBoxCoder'),
fc_out_channels=1024,
in_channels=256,
loss_bbox=dict(beta=1.0, loss_weight=1.0, type='SmoothL1Loss'),
loss_cls=dict(
loss_weight=1.0,
type='CrossEntropyLoss',
use_sigmoid=False),
num_classes=1,
reg_class_agnostic=True,
roi_feat_size=7,
type='Shared2FCBBoxHead'),
dict(
bbox_coder=dict(
target_means=[
0.0,
0.0,
0.0,
0.0,
],
target_stds=[
0.033,
0.033,
0.067,
0.067,
],
type='DeltaXYWHBBoxCoder'),
fc_out_channels=1024,
in_channels=256,
loss_bbox=dict(beta=1.0, loss_weight=1.0, type='SmoothL1Loss'),
loss_cls=dict(
loss_weight=1.0,
type='CrossEntropyLoss',
use_sigmoid=False),
num_classes=1,
reg_class_agnostic=True,
roi_feat_size=7,
type='Shared2FCBBoxHead'),
],
bbox_roi_extractor=dict(
featmap_strides=[
4,
8,
16,
32,
],
out_channels=256,
roi_layer=dict(output_size=7, sampling_ratio=0, type='RoIAlign'),
type='SingleRoIExtractor'),
interleaved=True,
mask_head=[
dict(
conv_out_channels=256,
in_channels=256,
loss_mask=dict(
loss_weight=1.0, type='CrossEntropyLoss', use_mask=True),
num_classes=1,
num_convs=4,
type='HTCMaskHead',
with_conv_res=False),
dict(
conv_out_channels=256,
in_channels=256,
loss_mask=dict(
loss_weight=1.0, type='CrossEntropyLoss', use_mask=True),
num_classes=1,
num_convs=4,
type='HTCMaskHead'),
dict(
conv_out_channels=256,
in_channels=256,
loss_mask=dict(
loss_weight=1.0, type='CrossEntropyLoss', use_mask=True),
num_classes=1,
num_convs=4,
type='HTCMaskHead'),
],
mask_info_flow=True,
mask_roi_extractor=dict(
featmap_strides=[
4,
8,
16,
32,
],
out_channels=256,
roi_layer=dict(output_size=14, sampling_ratio=0, type='RoIAlign'),
type='SingleRoIExtractor'),
num_stages=3,
stage_loss_weights=[
1,
0.5,
0.25,
],
type='HybridTaskCascadeRoIHead'),
rpn_head=dict(
anchor_generator=dict(
ratios=[
0.5,
1.0,
2.0,
],
scales=[
8,
],
strides=[
4,
8,
16,
32,
64,
],
type='AnchorGenerator'),
bbox_coder=dict(
target_means=[
0.0,
0.0,
0.0,
0.0,
],
target_stds=[
1.0,
1.0,
1.0,
1.0,
],
type='DeltaXYWHBBoxCoder'),
feat_channels=256,
in_channels=256,
loss_bbox=dict(
beta=0.1111111111111111, loss_weight=1.0, type='SmoothL1Loss'),
loss_cls=dict(
loss_weight=1.0, type='CrossEntropyLoss', use_sigmoid=True),
type='RPNHead'),
test_cfg=dict(
rcnn=dict(
mask_thr_binary=0.5,
max_per_img=100,
nms=dict(iou_threshold=0.5, type='soft_nms'),
score_thr=0.001),
rpn=dict(
max_per_img=1000,
min_bbox_size=0,
nms=dict(iou_threshold=0.7, type='nms'),
nms_pre=1000)),
train_cfg=dict(
rcnn=[
dict(
assigner=dict(
ignore_iof_thr=-1,
min_pos_iou=0.5,
neg_iou_thr=0.5,
pos_iou_thr=0.5,
type='MaxIoUAssigner'),
debug=False,
mask_size=28,
mask_thr_binary=0.5,
pos_weight=-1,
sampler=dict(
add_gt_as_proposals=True,
neg_pos_ub=-1,
num=512,
pos_fraction=0.25,
type='RandomSampler')),
dict(
assigner=dict(
ignore_iof_thr=-1,
min_pos_iou=0.6,
neg_iou_thr=0.6,
pos_iou_thr=0.6,
type='MaxIoUAssigner'),
debug=False,
mask_size=28,
mask_thr_binary=0.5,
pos_weight=-1,
sampler=dict(
add_gt_as_proposals=True,
neg_pos_ub=-1,
num=512,
pos_fraction=0.25,
type='RandomSampler')),
dict(
assigner=dict(
ignore_iof_thr=-1,
min_pos_iou=0.7,
neg_iou_thr=0.7,
pos_iou_thr=0.7,
type='MaxIoUAssigner'),
debug=False,
mask_size=28,
mask_thr_binary=0.5,
pos_weight=-1,
sampler=dict(
add_gt_as_proposals=True,
neg_pos_ub=-1,
num=512,
pos_fraction=0.25,
type='RandomSampler')),
],
rpn=dict(
allowed_border=0,
assigner=dict(
ignore_iof_thr=-1,
min_pos_iou=0.3,
neg_iou_thr=0.3,
pos_iou_thr=0.7,
type='MaxIoUAssigner'),
debug=False,
pos_weight=-1,
sampler=dict(
add_gt_as_proposals=False,
neg_pos_ub=-1,
num=256,
pos_fraction=0.5,
type='RandomSampler')),
rpn_proposal=dict(
max_per_img=2000,
min_bbox_size=0,
nms=dict(iou_threshold=0.7, type='nms'),
nms_pre=2000)),
type='HybridTaskCascade')
optimizer = dict(lr=0.02, momentum=0.9, type='SGD', weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
resume_from = None
root_path = 'out_shp/train/512-512/'
runner = dict(max_epochs=12, type='EpochBasedRunner')
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
flip=False,
img_scale=(
800,
800,
),
transforms=[
dict(keep_ratio=True, type='Resize'),
dict(type='RandomFlip'),
dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True,
type='Normalize'),
dict(size_divisor=32, type='Pad'),
dict(keys=[
'img',
], type='ImageToTensor'),
dict(keys=[
'img',
], type='Collect'),
],
type='MultiScaleFlipAug'),
]
test_root = 'out_shp/train/512-512/'
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(img_scale=(
600,
600,
), keep_ratio=True, type='Resize'),
dict(flip_ratio=0.5, type='RandomFlip'),
dict(
mean=[
123.675,
116.28,
103.53,
],
std=[
58.395,
57.12,
57.375,
],
to_rgb=True,
type='Normalize'),
dict(size_divisor=32, type='Pad'),
dict(type='DefaultFormatBundle'),
dict(keys=[
'img',
'gt_bboxes',
'gt_labels',
'gt_masks',
], type='Collect'),
]
work_dir = 'out_shp/train/512-512/'
workflow = [
(
'train',
1,
),
]

05/09 13:46:58 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
05/09 13:46:58 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook

before_train:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(VERY_LOW ) CheckpointHook

before_train_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(NORMAL ) DistSamplerSeedHook
(NORMAL ) NumClassCheckHook

before_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook

after_train_iter:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_train_epoch:
(NORMAL ) IterTimerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

before_val:
(VERY_HIGH ) RuntimeInfoHook

before_val_epoch:
(NORMAL ) IterTimerHook
(NORMAL ) NumClassCheckHook

before_val_iter:
(NORMAL ) IterTimerHook

after_val_iter:
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook

after_val_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW ) ParamSchedulerHook
(VERY_LOW ) CheckpointHook

after_val:
(VERY_HIGH ) RuntimeInfoHook

after_train:
(VERY_HIGH ) RuntimeInfoHook
(VERY_LOW ) CheckpointHook

before_test:
(VERY_HIGH ) RuntimeInfoHook

before_test_epoch:
(NORMAL ) IterTimerHook

before_test_iter:
(NORMAL ) IterTimerHook

after_test_iter:
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook

after_test_epoch:
(VERY_HIGH ) RuntimeInfoHook
(NORMAL ) IterTimerHook
(BELOW_NORMAL) LoggerHook

after_test:
(VERY_HIGH ) RuntimeInfoHook

after_run:
(BELOW_NORMAL) LoggerHook

Traceback (most recent call last):
File "D:\codae\mmdetection\tools\train.py", line 121, in <module>
main()
File "D:\codae\mmdetection\tools\train.py", line 117, in main
runner.train()
File "C:\Users\gang.conda\envs\pytorch\lib\site-packages\mmengine\runner\runner.py", line 1722, in train
raise RuntimeError(
RuntimeError: self._train_loop should not be None when calling train method. Please provide train_dataloader, train_cfg, optimizer and param_scheduler arguments when initializing runner.

Process finished with exit code 1

wdzwdxy commented May 9, 2024

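The error message says that the train_dataloader, train_cfg, optimizer and param_scheduler arguments were not provided, but the config file does contain data loading, optimizer, and related settings.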
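One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the dumped config uses top-level keys such as `data`, `optimizer`, `lr_config` and `runner`, while MMEngine's `Runner.from_cfg` reads `train_dataloader`, `train_cfg`, `optim_wrapper` and `param_scheduler`. The sketch below is plain Python with no MMEngine dependency. The helper names `missing_train_keys` and `legacy_keys_present` and the key mapping are illustrative, not part of any library.

```python
# Top-level keys that MMEngine's Runner.from_cfg reads to build the train loop.
# (param_scheduler is read as well, but is optional.)
REQUIRED_TRAIN_KEYS = ("train_dataloader", "train_cfg", "optim_wrapper")

# Older-style keys seen in the dumped config, mapped to the key that is
# assumed to take over their role in an MMEngine-style config.
LEGACY_TO_NEW = {
    "data": "train_dataloader",
    "runner": "train_cfg",
    "optimizer": "optim_wrapper",
    "lr_config": "param_scheduler",
}


def missing_train_keys(cfg):
    """Return the required training keys that are absent (or None) in cfg."""
    return [key for key in REQUIRED_TRAIN_KEYS if cfg.get(key) is None]


def legacy_keys_present(cfg):
    """Return {legacy_key: suggested_new_key} for legacy keys found in cfg."""
    return {old: new for old, new in LEGACY_TO_NEW.items() if old in cfg}


# Top-level keys abbreviated from the config dump in this issue.
issue_cfg = {
    "data": {"samples_per_gpu": 1, "workers_per_gpu": 1},
    "optimizer": {"type": "SGD", "lr": 0.02, "momentum": 0.9,
                  "weight_decay": 0.0001},
    "lr_config": {"policy": "step", "step": [8, 11]},
    "runner": {"type": "EpochBasedRunner", "max_epochs": 12},
}

if __name__ == "__main__":
    print("missing:", missing_train_keys(issue_cfg))
    print("legacy keys found:", legacy_keys_present(issue_cfg))
```

If the three required keys are all missing, the runner has nothing to build a train loop from, which would match the RuntimeError even though the config "has" data and optimizer settings under the older names.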

yurujaja commented Jun 2, 2024

same issues here

@zzzcccxx

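The error message says that the train_dataloader, train_cfg, optimizer and param_scheduler arguments were not provided, but the config file does contain data loading, optimizer, and related settings.

So what was the final solution? Where should these settings be added?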
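Nobody in this thread confirmed where the settings belong, so the following is only a sketch under assumptions: in an MMEngine-style config, the training settings are top-level variables named `train_dataloader`, `train_cfg`, `optim_wrapper` and `param_scheduler`. The values are copied from the config dump in the first comment. The dataset field layout (`metainfo`, `data_prefix`), the empty `train_pipeline` placeholder and `gamma=0.1` are assumptions, and the pipeline itself would also need migrating.

```python
# Hypothetical MMEngine-style counterparts of the legacy `data`, `runner`,
# `optimizer` and `lr_config` entries. Values come from the dumped config.
data_root = 'out_shp/train/512-512/'

# Placeholder only: the legacy transforms (Normalize, DefaultFormatBundle,
# Collect) are not carried over here and would need their own migration.
train_pipeline = []

train_dataloader = dict(
    batch_size=1,   # was data.samples_per_gpu
    num_workers=1,  # was data.workers_per_gpu
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='CocoDataset',
        data_root=data_root,
        metainfo=dict(classes=('CultivatedLand', )),
        ann_file='annotations/train.json',
        data_prefix=dict(img='train/'),
        pipeline=train_pipeline))

# was runner = dict(type='EpochBasedRunner', max_epochs=12)
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=12)

# was optimizer = dict(type='SGD', ...) plus optimizer_config
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))

# was lr_config: linear warmup for 500 iterations, then steps at epochs 8 and 11
param_scheduler = [
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    dict(type='MultiStepLR', by_epoch=True, begin=0, end=12,
         milestones=[8, 11], gamma=0.1),  # gamma assumed (legacy default)
]
```

These variables would sit at the top level of the config file, next to `model` and `work_dir`, not nested inside `data` or `model`.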

xiuqin0 commented Jul 12, 2024

Has this been solved?

corgok commented Jul 30, 2024

I just started with the Tutorial.ipynb and noticed that it was written on Ubuntu. Could it be a Windows thing? Edit: probably not. I just solved it by moving the script into the base folder of the project and setting cfg.work_dir relative to that folder, which is how it is set up in the tutorial: https://colab.research.google.com/github/open-mmlab/mmocr/blob/dev-1.x/demo/tutorial.ipynb#scrollTo=67OJ6oAvN6NA
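The workaround above depends on where the script runs from. A minimal sketch, assuming the goal is simply to make a relative work_dir independent of the current directory; `resolve_work_dir` is a hypothetical helper, not an MMEngine function:

```python
from pathlib import Path


def resolve_work_dir(project_root, work_dir):
    """Return work_dir as an absolute path anchored at project_root.

    Absolute work_dir values are kept; relative ones are joined onto
    project_root, so the result does not depend on the current directory.
    """
    path = Path(work_dir)
    if not path.is_absolute():
        path = Path(project_root) / path
    return path.resolve()


if __name__ == "__main__":
    # Assumption: the script sits in the project's base folder, so its
    # parent directory is used as the project root.
    root = Path(__file__).resolve().parent
    print(resolve_work_dir(root, "out_shp/train/512-512/"))
```

The resulting path could then be assigned to cfg.work_dir before the runner is built.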
