# batchtools_slurm futures block (Was: stdout/stderr to log file only when using slurm?) #68
Thanks for the quick response.
Hmm... I don't see how the above for loop finishes and you get a prompt back, which then all of a sudden blocks. There must be something else you do for the latter to occur. Do you have a reproducible example?
This is exactly the code I tried on the Slurm cluster. The template file I use is there.
Ok. The only thing I can see happening, if it just hangs later while you're doing other things, is that the garbage collector runs, which then triggers garbage collection of the 10 futures you created above. Garbage collecting a batchtools future involves trying to get its result, which requires waiting for the scheduler to process it.

First of all, it's not clear to me whether you're trying future.batchtools for the very first time and it doesn't work, or whether you've got it to work in the past and it now doesn't. I recommend that you first make sure that you can create a basic future and then get its value, e.g.

```r
library(future.batchtools)
plan(batchtools_slurm, ...)
f <- future(42)
v <- value(f)
```

Does that also hang? If it does, you can set …. Also, spawning ….

PS. Please don't post screenshots of plain code.
Yeah, everything is working, and I find {future} and {future.batchtools} very useful for my purpose (very convenient for running things on the cluster). It is just that my use might not be the intended purpose of the package. I'm just running things and saving the results in rds files, so I don't really need to return the results. And I don't want to, because I don't want it to block my R session, so that I can try other things in the meantime. If you tell me this is not really the purpose of the packages, then I'll look for something else when I have time.

PS: Sorry about the screenshot, but I don't have a choice as I can't really get things out of the cluster easily.
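A minimal sketch of this fire-and-forget pattern, under stated assumptions: the template file name `slurm.tmpl` and the `heavy_computation()` helper are illustrative placeholders, not from the thread.

```r
library(future.batchtools)

## Assumption: a Slurm template file "slurm.tmpl" exists in the
## working directory (the name here is illustrative).
plan(batchtools_slurm, template = "slurm.tmpl")

## Placeholder for the real long-running task.
heavy_computation <- function(i) {
  Sys.sleep(1)
  i^2
}

for (i in 1:10) {
  ## Each future saves its own result to disk and returns NULL,
  ## so there is (almost) nothing to send back to the main session.
  future({
    res <- heavy_computation(i)
    saveRDS(res, sprintf("result_%02d.rds", i))
    NULL
  })
}
```

The results are then read back later with `readRDS()`, independently of the submitting R session.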
I think you are right about the garbage collector.

Also note that my session is not blocked if the loop finishes submitting all jobs before any of them finishes.
Yes. You should be able to see the finalizer being called if you set ….

I'll try to give more suggestions soon-ish; I need to find time to do a few test runs.
Thanks Henrik! |
Here's how you can disable the finalizer of batchtools futures:

```r
library(future.batchtools)
plan(batchtools_slurm, workers = Inf, finalize = FALSE)
# Warning message:
# In tweak.future(function (expr, envir = parent.frame(), substitute = TRUE,  :
#   Detected 1 unknown future arguments: 'finalize'
```

Unfortunately, you're gonna get that annoying warning (an oversight by me), but it does work. With this, the finalizer, which attempts to collect the future results and then delete them from disk, will not run. Since there is an infinite number of workers (the default is 100), you will also not hit an upper limit on concurrently running futures. If you'd hit that limit, the next future would wait until one of the running futures had been resolved, which would require collecting its results (which you are trying to avoid).

It works, but is this a good idea? I'm not sure. I say this mostly because of this type of Future API use case, where you just use `f <- future(..., lazy = FALSE)` ….

Having said all this, my gut feeling is that your approach should be good for a very long time.
Thanks for this. I don't think my use case is very uncommon; when you schedule jobs on a cluster, you usually have them run independently, each at its own pace, storing some results and releasing the computing resources when finished. For scheduling many jobs using e.g. a loop, {future.batchtools} is SO useful. It is just that I'm not really interested in returning the results/stdout because I'm already storing them to disk. I'll try the ….
Sorry, I wasn't clear enough. I meant within the Future API ecosystem. First, the use case where …. I agree, there are definitely cases where you want to launch HPC jobs from R and then leave R, leave it to the user to manually poll the queue, and then continue the analyses of the produced results elsewhere. We already have ….

Now, if we peek into the future (pun intended), there might be a day when we have a queueing framework for futures in R: a queue that does not necessarily run an HPC scheduler in the background. Very very roughly, something like:

```r
q <- future_queue()
f <- future(..., lazy = TRUE)
future_submit(q, f)
v <- value(f)
```

Maybe your use case of not caring about ….
Yes, the export of the globals is very convenient.
Hi @privefl, |
@johanneskoch94 For now, I'm just making sure I stop the loop after all jobs have been submitted, but before getting any result back. |
Thanks @privefl. |
Good to hear.

Careful with ….
True, I will do as you suggest. Thanks!
I have a loop like this:
and I would like to output the warnings in the log files only. I've seen the new option `split` in a blog post to do both (but I didn't find it in the documentation). The problem is that capturing this output and returning it to the main session usually blocks my session for a long time (sometimes minutes), and I would like to avoid this.

Basically, is there any way to use future/future.batchtools just to submit jobs on a Slurm cluster without returning anything to the session sending them?