
Early stopping criteria is only checked for the ReduceLROnPlateau scheduler #215

Open
mcognetta opened this issue Aug 25, 2023 · 5 comments
Labels: enhancement (New feature or request), work in process (We are now working on this issue)

Comments

@mcognetta
Contributor

mcognetta commented Aug 25, 2023

Describe the bug
Early stopping and patience are only considered by the ReduceLROnPlateau scheduler. For example, using bleu as an early-stopping criterion with the WarmupInverseSquareRootScheduler will not stop training early, even if validation performance does not improve for more than patience iterations.

The issue is in these lines:

joeynmt/joeynmt/training.py

Lines 532 to 540 in 0968187

        if self.stats.is_min_lr or self.stats.is_max_update:
            break  # exit the inner (batch) loop

    if self.stats.is_min_lr or self.stats.is_max_update:
        log_str = (f"minimum lr {self.learning_rate_min}"
                   if self.stats.is_min_lr else
                   f"maximum num. of updates {self.max_updates}")
        logger.info("Training ended since %s was reached.", log_str)
        break  # exit the outer (epoch) loop

There should be an additional stats variable checking whether validation performance has failed to improve for N iterations, so that early stopping can trigger regardless of the scheduler.
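
For illustration only, here is a minimal self-contained sketch of the kind of check meant above: count consecutive validations without improvement and stop once a patience budget is exhausted. The EarlyStopper class and all names below are hypothetical, not joeynmt's API.

    class EarlyStopper:
        """Signal a stop after `patience` consecutive validations without improvement."""

        def __init__(self, patience: int, minimize: bool = False):
            self.patience = patience
            self.minimize = minimize      # True for loss/ppl, False for bleu/accuracy
            self.best = None
            self.bad_validations = 0

        def should_stop(self, score: float) -> bool:
            improved = (self.best is None
                        or (score < self.best if self.minimize else score > self.best))
            if improved:
                self.best = score
                self.bad_validations = 0
            else:
                self.bad_validations += 1
            return self.bad_validations >= self.patience

    # Example: bleu stops improving after the third validation, so with
    # patience=3 the stop fires on the sixth validation.
    stopper = EarlyStopper(patience=3)
    for bleu in [10.2, 14.8, 15.1, 15.0, 14.9, 15.1]:
        if stopper.should_stop(bleu):
            print("early stopping triggered")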

To Reproduce
Steps to reproduce the behavior:

  1. Use any of the basic training configurations.
  2. Change the scheduling parameter to scheduling: "warmupinversesquareroot"
  3. Run training.
  4. Observe that training does not stop until the maximum number of epochs is reached, even if validation performance stops improving.

Logged output
If possible, please add the training log and validation reports or excerpts of them.

Expected behavior
Training stops early when validation stops improving.

System (please complete the following information):

  • OS: Linux
  • CPU & GPU
  • Python 3.10

Side note: Sorry for the influx of issues (there will be a few more in coming days). I am getting ramped up on this repo for a project of mine, so I have been doing lots of exploring. I will help to contribute fixes to anything that I report here.

@may-
Collaborator

may- commented Sep 4, 2023

@mcognetta
Thank you for reporting!
I feel it's not a bug, but expected behaviour.

Actually, we initially introduced the patience parameter for the plateau scheduler only.

patience: 5 # specific to plateau scheduler: wait for this many validations without improvement before decreasing the learning rate

The patience term in the plateau scheduler doesn't directly control early stopping, but the LR reduction. Early stopping is triggered when the LR reaches the specified minimum value.
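
To make that distinction concrete, here is a small self-contained sketch using plain PyTorch's ReduceLROnPlateau rather than joeynmt's wrapper (all numbers are illustrative): patience only delays each LR reduction, and the stop condition fires only once the LR has decayed below a chosen minimum.

    import torch

    model = torch.nn.Linear(4, 4)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # `patience` waits this many validations before *reducing* the LR, not before stopping.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="max", factor=0.5, patience=2)

    learning_rate_min = 1e-6  # the actual early-stopping threshold

    for step in range(1, 101):
        scheduler.step(15.0)  # validation bleu never improves
        current_lr = optimizer.param_groups[0]["lr"]
        if current_lr <= learning_rate_min:
            print(f"stop after {step} stagnant validations (lr is now {current_lr:.1e})")
            break
    # With these settings the LR is halved every 3 stagnant validations, so the
    # stop only fires after roughly 30 validations without improvement.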

Although Joeynmt doesn't stop the iteration, it skips saving the weights if the validation performance has not improved.

joeynmt/joeynmt/training.py

Lines 654 to 658 in 0968187

    # save checkpoints
    is_better = (self.stats.is_better(ckpt_score, self.ckpt_queue)
                 if len(self.ckpt_queue) > 0 else True)
    if self.num_ckpts < 0 or is_better:  # num_ckpts < 0 keeps every checkpoint
        self._save_checkpoint(new_best, ckpt_score)
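
As a standalone illustration of that behaviour (not joeynmt's actual implementation; names and numbers are made up), a "keep the k best checkpoints" policy naturally skips saving when validation stagnates:

    import heapq

    keep_best = 3
    kept = []  # min-heap of (score, path), for metrics where higher is better

    def maybe_save(score: float, path: str) -> bool:
        """Save only if the score beats the worst kept checkpoint."""
        if len(kept) < keep_best:
            heapq.heappush(kept, (score, path))
            return True
        if score > kept[0][0]:              # better than the worst kept one
            heapq.heapreplace(kept, (score, path))
            return True
        return False                        # stagnant validation: skip saving

    for step, bleu in enumerate([20.0, 21.5, 21.4, 21.6, 21.3]):
        print(step, "saved" if maybe_save(bleu, f"ckpt_{step}.pt") else "skipped")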

I'm not sure how many users wish to stop the iteration instead of just skipping the checkpoint. I'm happy to discuss further!!

@mcognetta
Contributor Author

Hi, thanks for the background info.

IMO, the main reason one would want a validation-based early-stopping mechanism that isn't tied directly to the learning rate is when you are training several models on a corpus for which you don't know a priori how long training will take to converge. For example, when I start on a new corpus, I would like to be able to just set max_epochs to 100 and let it run until convergence (maybe it converges in 30 epochs and I don't want to waste 70 epochs' worth of compute). Tying it to the learning rate works for the plateau scheduler (and, in theory, for the inverse sqrt one, but it would take a ton of steps to decay the LR enough to stop training), but not for the other schedulers. This is particularly helpful when you are running many jobs in parallel and don't want to waste a bunch of compute on runs that are diverging, or have to manually inspect the validation scores for all of them throughout training.
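
As a rough self-contained illustration of that last point, using the generic inverse-square-root decay formula and made-up numbers (not joeynmt defaults):

    # How long until an inverse-square-root schedule reaches a minimum LR?
    # Decay after warmup: lr(t) ~= peak_lr * sqrt(warmup / t).
    peak_lr = 7e-4
    warmup = 4000
    learning_rate_min = 1e-8

    # Solve peak_lr * sqrt(warmup / t) <= learning_rate_min for t:
    steps_needed = warmup * (peak_lr / learning_rate_min) ** 2
    print(f"{steps_needed:.1e} updates until the min-LR stop would fire")
    # -> about 2e13 updates, i.e. it effectively never fires in practice.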

I know this library has different goals than fairseq (and, in fact, this is why I am so interested in this library), but it should be noted that this functionality is the default in fairseq and most other NMT packages I have used and this meaning of patience is pretty common in the literature.

I have a fork where I implemented it in just a few lines (I kept patience the same for the plateau scheduler and introduced a new config variable that implements this idea). I will open a PR if this is not a hard no (ofc, happy to discuss further).

@may-
Collaborator

may- commented Sep 4, 2023

I understand there are such situations, and it would be nice to have.

My concern is how essential the functionality is, though. You know, I also implement many small "nice-to-have"s, but I keep them in my personal fork and don't merge them into the main branch.
To me, this sounds more like a job for a hyperparameter tuning tool such as optuna (they have pruning callbacks) and not necessarily something that needs to be coded in official joey. But yeah, maybe it's just my preference. If joey users are happy with it, I of course have no objection at all 😄

Thank you again for your effort and deep consideration! I really appreciate that. I'm so glad to see our joey is still growing 🐨

@mcognetta
Contributor Author

Understood, of course I will defer to your judgement. However, do you mind if I open (and then immediately close) a PR here, linked to this issue, so that others can use it in the future if they need it? Adding the relevant code snippets here is a little cumbersome, and it would be good for posterity. It may also serve as a place for discussion about whether this feature is wanted by many users or not.

Thanks!

@may-
Collaborator

may- commented Sep 15, 2023

Yes, please open a PR! I'm more than happy with a democratic decision.
Honestly, I'm sometimes afraid that this project has some bias towards my personal preferences. It's always nice to have fresh ideas and many eyes on it. Thank you for your effort!!
