
How to obtain base detector weights pretrained on ILSVRC2015 and ILSVRC #103

Open · yuyangyangji opened this issue Oct 15, 2024 · 5 comments


@yuyangyangji

Thanks for the great work on YOLOV and YOLOV++. In this repo you explain how to fine-tune the model from a pre-trained YOLOX model. However, I find that in the paper the base detector is trained on the ILSVRC2015 and ILSVRC datasets. Does this repo provide the code for us to obtain those pre-trained weights? Thanks, hoping for your answer!

@YuHengsss
Owner

Thanks for your interest in our work. We indeed provide code to train the base detector. Taking ImageNet VID as an example, the experiment file to train a base detector (e.g. YOLOX-S) can be found here:

class Exp(MyExp):

You can use tools/train.py together with the experiment file to train the base detector. Their usage is the same as in YOLOX.
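For reference, a minimal sketch of what such a YOLOX-style experiment file might look like for a YOLOX-S base detector on ImageNet VID. The values below (class count, dataset path, schedule) are illustrative assumptions, not the repo's actual settings; see the experiment file linked above for the real configuration.

```python
# Hedged sketch of a YOLOX-style Exp file; placeholder values only.
import os
from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33          # YOLOX-S depth multiplier
        self.width = 0.50          # YOLOX-S width multiplier
        self.num_classes = 30      # ImageNet VID has 30 categories
        self.data_dir = "datasets/ILSVRC2015"  # hypothetical dataset root
        self.max_epoch = 7                     # illustrative schedule
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
```

Training is then launched with tools/train.py, passing this file via -f, the COCO-pretrained YOLOX checkpoint via -c, and the usual -d/-b device and batch-size flags, just as in YOLOX.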

@yuyangyangji
Author

yuyangyangji commented Oct 16, 2024

Thanks! About the training stages: the first stage starts from COCO-pretrained weights, freezes the backbone, and only fine-tunes the linear projection layers in the YOLOX prediction head on the sampled ILSVRC2015 and ILSVRC dataset. The second stage uses the full ILSVRC2015, freezes the backbone, and fine-tunes the prediction head, the newly added video object classification branch, and the FAM. Does this description deviate from the original paper? I also wonder whether the first-stage training needs to train the FAM module and the newly added classification branch. Hoping for your answer, thanks!

@YuHengsss
Owner

Hello, you need to fine-tune all the COCO-pretrained weights in the first stage, NOT only the linear projection head. The procedure for the second stage is correct.
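To make the split of trainable parameters concrete, here is a rough sketch (not the repo's actual training code) of the freezing pattern described for the second stage: backbone frozen, with the prediction head, the added video classification branch, and the FAM left trainable. The attribute names backbone, head, cls_branch, and fam are hypothetical placeholders.

```python
import torch

def build_stage2_optimizer(model, lr=1e-3):
    # Freeze the backbone so no gradients are computed for it.
    for p in model.backbone.parameters():
        p.requires_grad = False

    # Collect only the modules that the second stage fine-tunes.
    trainable = []
    for module in (model.head, model.cls_branch, model.fam):
        trainable.extend(p for p in module.parameters() if p.requires_grad)

    return torch.optim.SGD(trainable, lr=lr, momentum=0.9)
```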

@yuyangyangji
Author

Hello, I have now successfully trained YOLOV++ and I have some questions about the feature selection module. In the paper you mention that a threshold is used to pick which proposals are selected for the FAM module, and that the number is always under 100 per frame. However, when training the second stage mentioned above (the v++ base decoupledreg_2x version) with the repo's default settings, I find that the number of chosen proposals is very high; I observe that more than 70% of the proposals are selected. Is this within expectation? What about directly using the proposals selected by SimOTA per frame? Hoping for your answer, thanks!

@YuHengsss
Owner

This phenomenon is intriguing. The number of candidates selected by the feature selection module depends on both the quality of the base detector and the characteristics of the image. Could you provide more details about the dataset you are using? With such a large number of candidates, the GPU memory cost would be extremely high and you should expect to hit an OOM error.
Additionally, the proposal number reported in Table 2 of our paper is an average value, not a minimum.
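For anyone debugging the same behaviour, below is a small sketch (not the repo's actual feature-selection code) of threshold-based proposal picking with an optional per-frame cap, plus a log line for the selection ratio discussed above. The threshold and cap values are assumptions.

```python
import torch

def select_proposals(scores, threshold=0.001, max_per_frame=100):
    """scores: per-proposal confidence tensor for one frame."""
    idx = torch.nonzero(scores > threshold, as_tuple=False).squeeze(1)
    if idx.numel() > max_per_frame:
        # Optional hard cap: keep only the highest-scoring proposals.
        top = scores[idx].topk(max_per_frame).indices
        idx = idx[top]
    ratio = idx.numel() / max(scores.numel(), 1)
    print(f"selected {idx.numel()}/{scores.numel()} proposals ({ratio:.1%})")
    return idx
```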
