"Submission rate too high" with a large future_lapply #13

Closed
kendonB opened this issue Oct 31, 2017 · 7 comments

kendonB commented Oct 31, 2017

My SLURM system got upset when submitting a large number of jobs:

Error in batchtools::submitJobs(reg = reg, ids = jobid, resources = resources) :
  Fatal error occurred: 101. Command 'sbatch' produced exit code 1. Output: 'sbatch: error: Submission rate too high, suggest using job arrays
sbatch: error: Batch job submission failed: Unspecified error'

Perhaps one could solve this with an interface to the sleep option in batchtools::submitJobs?

@wlandau-lilly

Apparently, there is a way to restrict the maximum number of jobs running at a time. It will probably be a SLURM environment variable. You might look at ?future.options.

This is why drake uses the jobs argument to set the maximum number of simultaneous jobs. Unfortunately, it does not apply to future_lapply.

@HenrikBengtsson

What's missing

Internally, batchtools::submitJobs() is used. It takes an argument sleep. Its help says:

If not provided (NULL), tries to read the value (number/function) from the configuration file (stored in reg$sleep) or defaults to a function with exponential backoff between 5 and 120 seconds.

I'm not sure what "exponential backoff between 5 and 120 seconds" really means. @mllg, does this mean that the sleep time grows exponentially from a minimum of 5 seconds to a maximum of 120 seconds between jobs?

Now, future.batchtools does not support specifying this sleep argument (so it uses the default). I've added FR #14 for this.
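
For reference, this is roughly what a custom sleep policy looks like when calling batchtools directly, a minimal sketch assuming an existing registry reg and job ids ids (future.batchtools does not expose this yet):

library(batchtools)
# A hypothetical back-off policy: start at 5 seconds, double each retry, cap at 120 seconds
backoff <- function(i) min(5 * 2^(i - 1), 120)
# `reg` and `ids` are assumed to come from an existing registry and batchMap() call
submitJobs(ids = ids, sleep = backoff, reg = reg)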

@wlandau-lilly, I have to think more about whether future_lapply() should have a future.max.futures.at.any.time-ish argument or whether that should/could be controlled elsewhere. I haven't thought about it much before, so I don't have a good sense right now. (Related to futureverse/future#159 and possibly also to futureverse/future#172.)

Workaround for now: Control via load balancing

future_lapply() will distribute the N tasks to all K workers it knows of. For workers on an HPC scheduler, the default is K = +Inf. Because of this, it will distribute the N tasks to N workers, that is, one task per worker, which is equivalent to one task per submitted job. In other words, if N is very large, future_lapply() may hit the scheduler too hard when using plan(batchtools_slurm).

If you look at ?batchtools_slurm you'll see an argument workers, which defaults to workers = Inf. (I do notice it is poorly documented/described.) If you use plan(batchtools_slurm, workers = 200), then future_lapply() will resolve all N tasks using K = 200 jobs. This means that each job will do single-core processing of roughly N/K tasks.
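
A minimal sketch of that workaround, assuming a SLURM cluster with a working batchtools template (future_lapply() lived in the future package at the time of this thread; nowadays it is provided by future.apply):

library(future.batchtools)
library(future.apply)                  # provides future_lapply()
plan(batchtools_slurm, workers = 200)  # cap at K = 200 concurrent jobs
# With N = 10000 tasks, the work is chunked into ~200 jobs of ~50 tasks each
y <- future_lapply(1:10000, function(x) sqrt(x))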

Comment: The main rationale for the workers argument of the batchtools_nnn backends is that even if you could submit N single-task jobs, the overhead of launching each job is so high that the total overhead of launching jobs would significantly dominate the overall processing time.


kendonB commented Nov 13, 2017

Original comment: To update, I have been happily using workers = N to work around this problem. The highest I've tried is workers = 500 and it worked fine.

Updated comment: The original version of this comment was plain wrong; the error just hadn't shown up yet. 500 seems to fail, 300 seems to fail, and 200 seems to work fine. Even when submitting more than 200, a bunch of jobs do start and, since drake is in charge, those resources aren't wasted.


wlandau-lilly commented Nov 14, 2017

@HenrikBengtsson from drake's point of view, this so-called "workaround" is actually an ideal solution in its own right. Here, imports and targets are parallelized with different numbers of workers, which is the right approach for distributed parallelism.

library(drake)
library(future.batchtools)
future::plan(batchtools_local, workers = 8)
# 4 jobs for imports, 8 jobs for targets:
make(my_plan, parallelism = "future_lapply", jobs = 4)

I will recommend this approach in the documentation shortly.


mllg commented Nov 14, 2017

Apparently, there is a way to restrict the maximum number of jobs running at a time. It will probably be a SLURM environment variable. You might look at ?future.options.

Yes. It was buried in the configuration, but in the next version you can also control it by setting the resource max.concurrent.jobs.
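
For instance, a hedged sketch (field and resource names as documented by batchtools; the resources route assumes future.batchtools forwards resources on to batchtools):

# In a batchtools configuration file (e.g. ~/.batchtools.conf.R):
max.concurrent.jobs = 200

# Or as a per-plan resource via future.batchtools:
library(future.batchtools)
plan(batchtools_slurm, resources = list(max.concurrent.jobs = 200))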

I'm not sure what "exponential backoff between 5 and 120 seconds" really means. @mllg, does this mean that the sleep time grows exponentially from a minimum of 5 seconds to a maximum of 120 seconds between jobs?

Exactly. The sleep time for iteration i is calculated as:

5 + 115 * pexp(i - 1, rate = 0.01)
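
For illustration (not part of the original comment), evaluating that formula for the first few iterations shows the sleep time ramping up from 5 seconds and saturating toward 120 seconds:

# Sleep times (in seconds) for iterations 1 through 10
sapply(1:10, function(i) 5 + 115 * pexp(i - 1, rate = 0.01))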

But note that I recently discovered a bug such that there was no sleeping at all 😞
This is fixed in the devel version, which I plan to release this week.

There is currently no support for controlling the submission rate. I could, however, match on the reported error message and treat it as a temporary error, which would then automatically trigger the sleep mechanism described above in submitJobs().


kendonB commented Nov 25, 2017

This problem appears to be solved with the latest version of batchtools. Feel free to close.

@HenrikBengtsson HenrikBengtsson added this to the Next release milestone Apr 12, 2020
@HenrikBengtsson

Related to this issue: I've changed the default number of workers on HPC schedulers from +Inf to 100 in the next release (commit 1a547d9). The default can be set via an option or environment variable.
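
A hedged example, assuming the option and environment variable are named future.batchtools.workers and R_FUTURE_BATCHTOOLS_WORKERS (check the future.batchtools documentation for the release in question):

# Assumed names; see ?future.batchtools for the authoritative spelling
options(future.batchtools.workers = 200)
# or, before starting R:
# export R_FUTURE_BATCHTOOLS_WORKERS=200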
