Add start_epoch variable to base trainer train() #92

Open

wants to merge 3 commits into main
Conversation

liamchalcroft (Contributor)

As far as I can tell, there is currently no way to resume training without the trainer starting again from epoch 1. Adding a start_epoch argument to train() should avoid overwrites and allow schedulers to continue correctly (not tested).

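For illustration, a minimal toy sketch of the idea (not the actual pythae BaseTrainer code; the model, optimizer, and scheduler setup here are assumptions made only for the example):

```python
import torch

class MinimalTrainer:
    """Toy trainer used only to illustrate the start_epoch idea."""

    def __init__(self, model, optimizer, scheduler, num_epochs):
        self.model = model
        self.optimizer = optimizer
        self.scheduler = scheduler
        self.num_epochs = num_epochs

    def train(self, start_epoch: int = 1):
        # Starting the loop at start_epoch (instead of a hard-coded 1) keeps
        # epoch numbering, and hence checkpoint names and scheduler steps,
        # consistent when a run is resumed.
        for epoch in range(start_epoch, self.num_epochs + 1):
            self.optimizer.zero_grad()
            loss = self.model(torch.randn(8, 4)).pow(2).mean()
            loss.backward()
            self.optimizer.step()
            self.scheduler.step()
            print(f"epoch {epoch}: loss = {loss.item():.4f}")


model = torch.nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
trainer = MinimalTrainer(model, optimizer, scheduler, num_epochs=20)
trainer.train(start_epoch=11)  # resume from epoch 11 rather than restarting at 1
```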
@clementchadebec (Owner)

Hi @liamchalcroft,

Sorry for the late reply. I do think this is a useful feature, and I will integrate it in the near future. Nonetheless, I was thinking of a method called resume_training_from_folder that takes as input the path to a folder containing the checkpoints of the model, the optimizer and the scheduler. It would then reload their state_dicts and resume the training as you propose.
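A rough sketch of how such a method might fit together, assuming the start_epoch argument proposed in this PR. The checkpoint file name, dictionary keys, and trainer attributes below are illustrative assumptions, not the actual pythae API:

```python
import os
import torch

def resume_training_from_folder(trainer, folder: str):
    # Sketch only: the checkpoint file name and keys are assumed for this
    # example and are not the pythae checkpoint layout.
    checkpoint = torch.load(os.path.join(folder, "checkpoint.pt"))
    trainer.model.load_state_dict(checkpoint["model_state_dict"])
    trainer.optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    trainer.scheduler.load_state_dict(checkpoint["scheduler_state_dict"])
    # Resume at the epoch following the last completed one, relying on the
    # start_epoch argument added to train().
    trainer.train(start_epoch=checkpoint["epoch"] + 1)
```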
