Different results from train.py and val.py #12965

Open
xibici opened this issue Apr 26, 2024 · 2 comments

Comments

xibici commented Apr 26, 2024

Question

After train.py finishes, I get this validation result:

Validating runs/train/exp212/weights/best.pt...
Fusing layers...
summary: 360 layers, 4124308 parameters, 0 gradients, 10.9 GFLOPs
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 7/7 [00:12<00:00, 1.80s/it]
all 1629 5983 0.853 0.765 0.844 0.386

But when I run val.py on its own, I get:

summary: 360 layers, 4124308 parameters, 0 gradients, 10.9 GFLOPs
val: Scanning /home/xjh/Pro/PythonPro/datasets/ContactHands/labels/val.cache... 1629 images, 0 backgrounds, 0 corrupt: 100%|██████████| 1629/1629 [00:00<?, ?it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 7/7 [00:38<00:00, 5.48s/it]
all 1629 5983 0.849 0.773 0.844 0.386
Speed: 0.1ms pre-process, 1.1ms inference, 0.8ms NMS per image at shape (256, 3, 640, 640)

Why is the mAP the same while P and R differ? How can I reproduce the original result from best.pt?

Metric    train.py    val.py
P         0.853       0.849
R         0.765       0.773

Additional

No response

xibici added the question label on Apr 26, 2024
glenn-jocher (Member) commented

Hey there! 👋

It looks like you're noticing slight differences in Precision (P) and Recall (R) between your training (via train.py) and validation (via val.py) processes, although mAP scores remain consistent.

This variation is quite normal and can be attributed to a few factors, especially since P and R are more sensitive to detection thresholding and the specifics of the dataset than mAP is. A slight change in the model's confidence on certain samples can shift P and R without significantly affecting the mAP. It's also worth remembering that during training, data augmentation and other regularization techniques can slightly alter the model's behavior compared to a standalone validation pass, where the data is not augmented.
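
To make that concrete, here is a tiny, self-contained sketch with made-up numbers (not your data): P and R are read off at a single confidence cutoff, while AP integrates the full precision-recall curve, so a small confidence shift can move P and R while AP barely moves.

```python
# Toy illustration (invented numbers, not from this issue).
import numpy as np

scores = np.array([0.95, 0.90, 0.62, 0.59, 0.30])  # detection confidences (toy)
correct = np.array([1, 1, 0, 1, 1], dtype=bool)     # does each detection match a GT box?
n_gt = 5                                             # total ground-truth objects

def pr_at(conf):
    """Precision/recall when only detections with score >= conf are kept."""
    keep = scores >= conf
    tp = correct[keep].sum()
    return tp / max(keep.sum(), 1), tp / n_gt

def average_precision():
    """Crude AP: area under the precision-recall curve over all cutoffs."""
    order = np.argsort(-scores)
    tp = np.cumsum(correct[order])
    fp = np.cumsum(~correct[order])
    recall, precision = tp / n_gt, tp / (tp + fp)
    return np.trapz(precision, recall)

print(pr_at(0.60))          # one P/R operating point
print(pr_at(0.58))          # a tiny shift in the cutoff changes both P and R...
print(average_precision())  # ...while AP, built from the whole curve, is unchanged
```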

To ensure you're getting consistent evaluations, make sure to use the same dataset, model weights (like your best.pt), and evaluation settings (e.g., confidence thresholds, NMS settings) in both training and validation scripts.
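
If you want to reproduce the numbers from best.pt as closely as possible, one option is to call YOLOv5's validation routine yourself with the settings spelled out. Below is a minimal sketch, assuming it is run from the YOLOv5 repo root; the dataset YAML name is a placeholder for your ContactHands config, and the keyword values shown mirror val.py's defaults, so adjust them to match whatever your training run used (image size, batch size, half precision, etc.).

```python
# Minimal sketch (run from the YOLOv5 repo root). The dataset YAML is a
# placeholder; adjust every setting to match how train.py validated.
import val  # yolov5/val.py

metrics, maps, times = val.run(
    data="data/ContactHands.yaml",                # placeholder dataset config
    weights="runs/train/exp212/weights/best.pt",  # checkpoint from the training run
    imgsz=640,
    batch_size=32,
    conf_thres=0.001,  # confidence threshold used when computing P/R/mAP
    iou_thres=0.6,     # NMS IoU threshold
    half=True,         # FP16 inference (val.py default)
    task="val",
)
print(metrics[:4])  # mean precision, mean recall, mAP@0.5, mAP@0.5:0.95
```

Running the equivalent command line (python val.py --data ... --weights ... --img 640) should give matching numbers as long as the flags mirror these values.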

If the slight variation in P and R still persists, this is generally not a cause for concern if your mAP scores are stable. However, ensuring model consistency across every metric is key to understanding your model's performance fully.

Hope this helps clear things up! Keep exploring with YOLOv5, and don't hesitate to reach out if you have more questions. Happy detecting! 🚀

github-actions bot (Contributor) commented

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

github-actions bot added the Stale label May 27, 2024