New feature: adding a parameter to control the number of pseudo-validation cases #2061

Open
wants to merge 1 commit into master

Conversation

ancestor-mithril
Contributor

This feature allows running fewer pseudo-validation iterations than the fixed 50, enabling either faster training or a more confident pseudo-validation score, depending on the needs of each experiment. It also works together with #2053 by making it possible to spawn no additional processes for pseudo-validation when -val_iters is set to 1.

While this value could also be adjusted by creating a custom trainer, it does not influence the training results, so there should be no need to write a custom trainer just for it.
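For context, the custom-trainer workaround looks roughly like the following; this is a minimal sketch, assuming the trainer keeps the count in an attribute such as num_val_iterations_per_epoch (import path, class, and attribute names follow the current nnU-Net v2 layout but should be treated as assumptions here):

```python
# Minimal sketch of the existing workaround: a custom trainer that overrides the
# pseudo-validation iteration count. Names are assumptions, not the PR's diff.
from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer


class nnUNetTrainer_FewerValIters(nnUNetTrainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Run 1 pseudo-validation batch per epoch instead of the default 50.
        self.num_val_iterations_per_epoch = 1
```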

@FabianIsensee FabianIsensee self-assigned this Apr 1, 2024
@FabianIsensee
Member

Hey, I just did a quick benchmark of the time the validation takes:

2024-04-09 20:11:50.785168: Epoch 0
2024-04-09 20:11:50.785251: Current learning rate: 0.01
train time 8.048835515975952
val time 0.5883698463439941
2024-04-09 20:11:59.424580: train_loss -0.3002
2024-04-09 20:11:59.424691: val_loss -0.7542
2024-04-09 20:11:59.424762: Pseudo dice [0.8257, 0.8178]
2024-04-09 20:11:59.424825: Epoch time: 8.64 s
2024-04-09 20:11:59.424881: Yayy! New best EMA pseudo Dice: 0.8218
2024-04-09 20:12:00.269000:
2024-04-09 20:12:00.269072: Epoch 1
2024-04-09 20:12:00.269136: Current learning rate: 0.00999
train time 6.751651048660278
val time 0.6082313060760498
2024-04-09 20:12:07.630979: train_loss -0.7611
2024-04-09 20:12:07.631112: val_loss -0.7954
2024-04-09 20:12:07.631186: Pseudo dice [0.8494, 0.846]
2024-04-09 20:12:07.631252: Epoch time: 7.36 s
2024-04-09 20:12:07.631310: Yayy! New best EMA pseudo Dice: 0.8244
2024-04-09 20:12:08.717944:
2024-04-09 20:12:08.718058: Epoch 2
2024-04-09 20:12:08.718133: Current learning rate: 0.00998
train time 6.752732515335083
val time 0.5877139568328857
2024-04-09 20:12:16.060827: train_loss -0.7902
2024-04-09 20:12:16.060989: val_loss -0.815
2024-04-09 20:12:16.061144: Pseudo dice [0.8695, 0.855]
2024-04-09 20:12:16.061276: Epoch time: 7.34 s
2024-04-09 20:12:16.061400: Yayy! New best EMA pseudo Dice: 0.8282

Validation in its default setting is less than 10% of the runtime (roughly 0.6 s of a 7.3–8.6 s epoch in the log above), which seems very reasonable to me. Do you really think that this is an urgently needed feature? I would like to avoid changing the parametrization of the trainers' __init__ as much as possible, because then everyone would have to go through all their code and modify their custom trainers accordingly. That would affect hundreds of trainers in our case.

@ancestor-mithril
Contributor Author

Do you really think that this is an urgently needed feature?

No, I would not say this is urgently needed. It would allow customizing the number of pseudo-validation steps, but it is not critical. Several trainers (e.g. StuNet v1) already use a single validation step per epoch.

Validation in its default setting is less than 10% of the runtime.

This is not always the case. When training from an HDD, or with any other I/O bottleneck, validation takes a lot of time.
Here I train from compressed data (npz) on an HDD, with 50 validation steps. The first value after "Epoch time" is the training time and the second is the validation time:
[screenshot: epoch timing log with 50 validation steps]
Here I reduce the number of validation steps to 1:
[screenshot: epoch timing log with 1 validation step]

I would like to avoid changing the parametrization of the trainers' __init__ as much as possible, because then everyone would have to go through all their code and modify their custom trainers accordingly.

Yes, I agree with this; that's why I added the new parameter at the end of the signature, with a default value. This way, all trainers inheriting from existing nnU-Net trainers should pick up the default value, and any further configuration after the super().__init__() call will not interfere with the new parameter. Would you prefer this parameter to be an environment variable?
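To illustrate the compatibility argument, here is a simplified sketch (not the actual diff; the constructor signature is abbreviated and the environment-variable name is hypothetical):

```python
import os


class nnUNetTrainer:
    # The new argument is appended last and defaulted, so existing subclasses that
    # call super().__init__(plans, configuration, fold, dataset_json, ...) without
    # it keep working and silently fall back to 50 iterations.
    def __init__(self, plans, configuration, fold, dataset_json,
                 num_val_iterations_per_epoch: int = 50):
        self.num_val_iterations_per_epoch = num_val_iterations_per_epoch


# Environment-variable alternative: no change to the signature at all.
# "nnUNet_val_iters" is a hypothetical variable name used only for illustration.
num_val_iters = int(os.environ.get("nnUNet_val_iters", 50))
```

The trailing defaulted argument keeps existing super().__init__() calls in custom trainers intact, while the environment-variable route avoids touching the signature entirely.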
