
Determine if calling future({}) will block #264

Closed
avsdev-cw opened this issue Nov 19, 2018 · 16 comments
Labels
Backend API (Part of the Future API that only backend package developers rely on), feature request

Comments

@avsdev-cw

In relation to issues #86 and #109, I believe it would be beneficial to have some mechanism for detecting whether calling future({}) (or value(f) on a lazy future) would block.

This would enable a number of high-level ways to build schedulers that don't require the developer to manage the queue based on the number of cores, cluster size, etc.

Apologies if this already exists, but I didn't see anything documented or mentioned in the issues.

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 6, 2018

Sorry for the delay. No, correct, such a feature does not exist in the Future API. It's an interesting idea. I wonder though exactly what type of questions can be asked and what answers can be guaranteed. For simplicity, imagine we create a lazy future:

f <- future(42, lazy = TRUE)

then what would:

res <- will_it_block(f)

mean for different types of backends? We know that for a sequential future the answer is TRUE, and probably the same for a multicore future where not all cores are occupied. But, if you use future.batchtools on top of an HPC scheduler with an "infinite" number of workers, what should be returned? Launching such a future will not block - it will just end up in the HPC queue, but it won't really start processing until later at an unknown time (when the scheduler finds an available resource). From the perspective of the Future API and its usage, does it even matter when the future starts? What does it even mean to wait in the queue of a scheduler - all jobs sit in the queue for some time, be it 0.01s for some jobs and 2 hours for others. All that should matter is when it completes - or not even that - just that it returns a value - blocking or not.

Maybe the answer to your question is that the Future API is not meant to be used for implementing job schedulers.

I don't know the answer to all this and I need to digest the idea much further before coming to a conclusion. But I'm open to further discussions/thoughts.
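One hypothetical shape for such a predicate, dispatching on the current backend (a sketch only: will_it_block() is not part of the package, and it leans on nbrOfFreeWorkers(), which this thread eventually arrives at in future 1.21.0):

```r
library(future)

# Hypothetical: would launching a future right now block the calling
# R session? Returns TRUE, FALSE, or NA when the backend cannot
# guarantee an answer.
will_it_block <- function() {
  strategy <- plan("next")  # the strategy the future would run under
  if (inherits(strategy, "sequential")) {
    TRUE   # sequential futures always resolve in the calling session
  } else if (inherits(strategy, c("multicore", "multisession", "cluster"))) {
    # blocks only when every background worker is already occupied
    nbrOfFreeWorkers(background = TRUE) == 0
  } else {
    NA     # e.g. an HPC queue: "blocking" has no well-defined meaning
  }
}
```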

@avsdev-cw
Author

avsdev-cw commented Dec 7, 2018

You are pretty close to the way I imagined it. However, my thinking on "will_it_block()" is more about whether or not it would block the process flow of the calling R script.

For example on a dual core machine running

plan(multicore)

# Futures with random sleeps to emulate long running tasks of different lengths
f1 <- future({Sys.sleep(sample(10:360, 1)); return(42)})
f2 <- future({Sys.sleep(sample(10:360, 1)); return(42)})
f3 <- future({Sys.sleep(sample(10:360, 1)); return(42)})

f3 would block until f1 or f2 finishes.

For building a simple scheduler I would want to know that calling f3 would block and therefore would either:

  • make f3 a lazy future and queue it to run later when I know f1 and/or f2 are finished;
  • store the task f3 is going to perform some other way to be created as a future when f1 or f2 finishes;
  • (worst case) return to the user to say that the system is already busy.

In my use case the view is: "I don't care when the future starts, I just want to queue it and I will wait for a complete signal/poll to check if it has finished at a later time, until then I need to go and do something else first, (eg tell the user the jobs are queued)"

Obviously there still needs to be a check at a programmer level that understands whether queues are sensible (multicore, multisession, HPC scheduler etc) or are not (single core, sequential) and to deal with those 2 cases, which they have to understand already.
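The first bullet above can be sketched as a minimal queue on top of the existing API (an illustration only; queue_job() and drain_queue() are made-up names, and a real scheduler would need error handling):

```r
library(future)
plan(multisession, workers = 2)

queue <- list()    # unevaluated expressions waiting for a worker
running <- list()  # futures currently executing

# Hypothetical helper: hold work until a worker frees up
queue_job <- function(expr) {
  queue[[length(queue) + 1]] <<- substitute(expr)
}

# Call periodically from the main event loop: harvest finished futures,
# then launch queued work while workers are idle
drain_queue <- function() {
  running <<- Filter(Negate(resolved), running)
  while (length(queue) > 0 && length(running) < nbrOfWorkers()) {
    expr <- queue[[1]]
    queue <<- queue[-1]
    running[[length(running) + 1]] <<- future(expr, substitute = FALSE)
  }
}

queue_job({ Sys.sleep(1); 42 })
drain_queue()
```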

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 7, 2018

Obviously there still needs to be a check at a programmer level that understands whether queues are sensible (multicore, multisession, HPC scheduler etc) or are not (single core, sequential) and to deal with those 2 cases, which they have to understand already.

Related to this point, it might be that will_it_block() has to return NA when neither TRUE nor FALSE can be guaranteed by the backend.

PS. Note that I (= my extremely conservative inner self) am still trying to distill/identify exactly what the core Future API is, and I'm extremely careful about adding a new "feature" before it is well understood what it entails. One of the objectives of the Core Future API is that it should handle all use cases and be supported by all backends (existing and future ones). Other features need to be "optional" since they cannot be supported everywhere - those will have to be part of the Extended/Optional Future API. This is what Issue #172 is about. It might be that will_it_block() fits in the Core API, but I'm not sure.

@HenrikBengtsson
Collaborator

Forgot to say: you might have "blocking" dependencies due to communication, e.g. launching a future on a remote system may block "for a substantial period of time" due to a poor internet connection - so what does it mean that launching a future will "block"?

@schloerke

What about a method that answers the question "is there an idle worker?" or "will this block my main R session"?

This could replace hacky logic of mine with something future could provide generally. Ex: if (worker_is_available()) { launch_future(f) } else { try_again_later() }. Specifically: rstudio/promises#60


Currently, I am accessing the number of possible workers using the code below... which does not seem robust.

workers <- formals(future::plan("next"))$workers %||% parallelly::availableCores()

Is there a cleaner way to access how many workers are executing?

# my current / hacky solution...
worker_is_available <- function() { 
  workers <- formals(future::plan("next"))$workers %||% parallelly::availableCores()

  # reach into the named `db` list and get the last entry
  reg <- tail(names(get("db", envir = environment(FutureRegistry))), 1)
  # Ask the registry for all workers
  used <- length(FutureRegistry(reg, action = "list", earlySignal = FALSE))

  used < workers
}

My take on ...

But, if you use a future.batchtools on top of a HPC scheduler with "infinite" number of workers, what should be returned? Launching such a future will not block - it will just end up on the HPC queue

This situation should always return FALSE. Adding another future here will never block the main R process.

@HenrikBengtsson
Collaborator

... formals(future::plan("next"))$workers ...

Unfortunately, this won't work if workers is a function, e.g.

plan(multisession, workers = function(...) sample(2:4, size=1))

Instead, just use:

> nbrOfWorkers()
[1] 3

Which leads to ...

What about a method that answers the question "is there an idle worker?" or "will this block my main R session"?

Maybe a first step toward this is to extend nbrOfWorkers() so that we can do:

> nbrOfWorkers(free = TRUE)
[1] 2

Maybe that will be sufficient to get going here? Since the Future API is not really aware of the concept of a "worker", I'm not a big fan of having to program with such a function, but we might come up with a wrapper function in the future that does not mention "worker" - that can come later. Anyway, I'll try to explore whether it is possible to support nbrOfWorkers(free = TRUE) for all future backends.

I suspect that for plan(sequential), we should always get:

> nbrOfWorkers(free = TRUE)
[1] 0

@schloerke

Instead, just use:

> nbrOfWorkers()
[1] 3

Great! (Sorry I missed the function 🤦 )

What about a method that answers the question "is there an idle worker?" or "will this block my main R session"?

Maybe a first step toward this is to extend nbrOfWorkers() so that we can do:

> nbrOfWorkers(free = TRUE)
[1] 2

This should work for me! In my situation, I could just check if nbrOfWorkers(free = TRUE) > 0 to know if I should schedule another future object.

Anyway, I'll try to explore whether it is possible to support nbrOfWorkers(free = TRUE) for all future backends.

Thank you!

I suspect that for plan(sequential), we should always get:

> nbrOfWorkers(free = TRUE)
[1] 0

Hmmmm. I'm wondering if it should always return 1. If we are able to ask the question nbrOfWorkers(free = TRUE) with plan(sequential), then at the time of executing the function, no futures would be calculating on the main R session, meaning a single worker is free.

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 2, 2020

I've just pushed branch feature/nbrOfWorkers-free with a prototype for nbrOfWorkers(free = TRUE).

I suspect that for plan(sequential), we should always get:

> nbrOfWorkers(free = TRUE)
[1] 0

Hmmmm. I'm wondering if it should always return 1. If we are able to ask the question nbrOfWorkers(free = TRUE) with plan(sequential), then at the time of executing the function, no futures would be calculating on the main R session, meaning a single worker is free.

Yes, that might be better. In the NEWS draft, I wrote:

  • Add support for nbrOfWorkers(free = TRUE) to query how many workers are free to take on futures immediately. ...

So, I've just pushed an update where nbrOfWorkers() always returns 1L for sequential backends regardless of free.

I think the above is an example of why the "will-it-block" feature request is tricky. It probably depends on what you're after and why you're asking the question. I don't remember all the details, but we had discussions in the past on "asynchronousness" and whether that should be something one should be able to request/declare. I've got too little time and am too tired to revisit that right now, but it could be related to the use-cases here.

@avsdev-cw
Author

avsdev-cw commented Dec 2, 2020

I suspect that for plan(sequential), we should always get:

> nbrOfWorkers(free = TRUE)
[1] 0

Hmmmm. I'm wondering if it should always return 1. If we are able to ask the question nbrOfWorkers(free = TRUE) with plan(sequential), then at the time of executing the function, no futures would be calculating on the main R session, meaning a single worker is free.

I would disagree. If you look at threading as a model, the idea of a "worker" is something that can run concurrently (and often in a totally isolated context) with the parent/controller. Counting the controller as a worker can be counterintuitive in this regard.

Take the following:

jobsToDo <- list(......)
jobsRun <- list()

while (TRUE) {
  if (nbrOfWorkers(free = TRUE) > 0) {
    job <- jobsToDo[[1]]
    jobsToDo <- jobsToDo[-1]
    jobsRun <- c(jobsRun, future(job))
  }
  processEvents() # may add more items to "jobsToDo" and may process the "jobsRun" list
  # maybe a small sleep here
}

Without looking at the module code, it is easy to grasp the concept: whilst there is a "worker" free, ask it to do a job, then let the main thread process any events (GUI, sockets, etc.). With a 1 instead of 0, there's a natural "why is there always a worker left free?".

For my money, nbrOfWorkers(free = TRUE) should return 0 on a sequential plan (as you originally surmised), leaving the coder to decide whether their code flow should allow the main R thread to be blocked or not. In my eyes it would be more acceptable for the main R process flow to continue and never do a background task than to become unresponsive processing background tasks.

@schloerke

I agree with Henrik on maybe needing different functions (or arguments) to answer different questions as there can be multiple ways to interpret "what does blocking mean?".

In your (@avsdev-cw) example, the code is used in a way that could be interpreted as “am I able to submit a new future job without waiting for a prior future to complete?“. Which is different than “when I submit this future job, will it be resolved in my main R session?“.

I agree that plan(sequential) uses the main session while resolving the future object. But I don't think the main R process should be left unable to create/resolve future()s due to an inefficient sequential plan choice by the user.


Using the same threading model concept, shouldn't nbrOfWorkers.uniprocess() also return 0? Currently it returns 1L.

With a 1 instead of 0, there's a natural "why is there always a worker left free?"

If free means: “I am able to submit a new future job (right now) without waiting for a prior future to complete“, then the sequential plan should always be free.

@HenrikBengtsson
Collaborator

Another related discussion that revolves around the main process of doing actual work or not is in good old Issue #7 (WISH: It should be possible to adjust the number of assigned cores)

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 3, 2020

FYI, I had to replace nbrOfWorkers(free=TRUE) with nbrOfFreeWorkers().

It turned out to be quite complicated to roll out nbrOfWorkers(free = TRUE), because that would require updating the S3 generic, which currently takes only one argument, to accept another argument free. But, since the reverse dependencies future.batchtools and future.BatchJobs declare S3 methods for nbrOfWorkers() too, they would produce an R CMD check WARNING on S3 method/generic mismatches. It's kind of a Catch 22: I cannot update future on CRAN before these packages are updated, and vice versa. The only solution I can imagine is to ask CRAN for a temporary exception (which can slow down a submission by a week or so). It might be that I could roll out new versions of future.{batchtools,BatchJobs} that are agile to changes in future at load time/on load - not sure how easy that is to achieve. So, I decided to punt on nbrOfWorkers(free = TRUE) for now.

@HenrikBengtsson
Collaborator

What about adding a background argument such that:

> plan(sequential)
> nbrOfFreeWorkers(background = FALSE)
[1] 1
> nbrOfFreeWorkers(background = TRUE)
[1] 0

and

> plan(cluster, workers = 1L)
> nbrOfFreeWorkers(background = FALSE)
[1] 1
> nbrOfFreeWorkers(background = TRUE)
[1] 1

?

@HenrikBengtsson added the Backend API label Dec 3, 2020
@avsdev-cw
Author

This solves all the use cases I can think of. Kudos!

@HenrikBengtsson
Collaborator

I've now merged this into the 'develop' branch, i.e. nbrOfFreeWorkers() will be part of the next release.

@HenrikBengtsson
Collaborator

future 1.21.0 is now on CRAN with nbrOfFreeWorkers(background = ...).
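For reference, a small usage sketch of the released API (the commented values reflect the expectations discussed above, assuming an otherwise idle session):

```r
library(future)  # nbrOfFreeWorkers() was added in future 1.21.0

plan(sequential)
nbrOfFreeWorkers(background = FALSE)  # 1: the main session can take the future
nbrOfFreeWorkers(background = TRUE)   # 0: nothing runs in the background

plan(multisession, workers = 2)
nbrOfFreeWorkers(background = TRUE)   # 2 while both workers are idle
f <- future(Sys.sleep(30))
nbrOfFreeWorkers(background = TRUE)   # 1 while f occupies a worker
```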
