detect forking #224

goldingn · 2018-05-09T04:58:17Z

I'm back to working on integrating future with greta.

Everything works swimmingly, except that I can't simultaneously execute tensorflow graphs in forked processes. Even if I re-import tensorflow, create a new graph etc., tensorflow just wigs out and the processes hang.

I think the best strategy for now is just to detect when the user is trying to use a forked plan, and error with a suggestion that they use a multisession plan instead.

I can detect whether they've done plan(multicore) or plan(multiprocess), but I don't know how to detect whether they've done something like this:

cl <- parallel::makeForkCluster(n)
plan(cluster, workers = cl)

Is there a preferred way of detecting this, or some other mechanism by which I can restrict the allowable plans?


goldingn · 2018-05-09T04:59:41Z

It would also be nice, though less important, to only error if their multiprocess session is set up to fork.

HenrikBengtsson · 2018-05-11T15:28:12Z

Interesting use case. There's currently nothing in the API that supports this type of querying of details of the backend (to be) used. One hack that you could use internally (disclaimer: I cannot guarantee that it'll be supported in the long term) is:

f <- future(NULL, lazy = TRUE)
workers <- f$workers
if (inherits(workers, "cluster")) {
  ## Worker is not yet assigned. Assume all are of the same kind; use first
  worker <- workers[[1]]
  if (inherits(worker, "forknode")) {
    stop("Parallel processing using forked processes is not supported")
  }
}
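For reference, the hack above can be combined with the plan(multicore) check mentioned in the original post into a single guard. This is only a sketch: assert_plan_not_forked() is a hypothetical name, not part of the future API, and it relies on the same unsupported `workers` field of a lazy future, so it may stop working in later versions of future.

```r
library(future)

## Hypothetical helper (not part of the future API).
## Assumption: a lazy cluster future exposes its workers in f$workers,
## which is an internal detail that is not guaranteed to stay.
assert_plan_not_forked <- function() {
  msg <- "Parallel processing using forked processes is not supported"

  ## plan() without arguments returns the currently set strategy
  if (inherits(plan(), "multicore")) {
    stop(msg, call. = FALSE)
  }

  ## Lazy future: nothing is launched, so no fork is triggered
  f <- future(NULL, lazy = TRUE)
  workers <- f$workers
  if (inherits(workers, "cluster") && length(workers) > 0L) {
    ## Assume all workers are of the same kind; inspect the first one
    if (inherits(workers[[1]], "forknode")) {
      stop(msg, call. = FALSE)
    }
  }

  invisible(TRUE)
}
```

Calling assert_plan_not_forked() before building any tensorflow graph would then error early for plan(multicore) and for a cluster created with parallel::makeForkCluster(), and return silently otherwise.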

A long-term solution would be to extend the Future API with a mechanism to specify this type of requirement, e.g.

f <- future(..., resources = list(disallow = "fork"))

This type of API extension falls under the general discussion in #172.

goldingn · 2018-05-14T01:11:29Z

Nice, that should work perfectly for now - thanks!

HenrikBengtsson · 2018-05-14T03:54:49Z

Oops, it should have been lazy = TRUE to avoid triggering an actual fork, or is it ok to launch a dummy forked future?

goldingn · 2018-05-14T04:50:48Z

Should be fine either way, but with lazy it's tidier. Thanks!

HenrikBengtsson · 2020-01-07T22:08:31Z

More examples where forked processing with multi-threading fails badly are starting to show up.
I've created #355 to track whether the future framework can/should protect against this or not.

Maybe the problem of how to detect whether we're running in a forked process should be addressed by R itself, because the stability issue applies to the 'parallel' package too. Posting to R-devel might be a good start.

HenrikBengtsson · 2020-01-11T19:07:16Z

@gaborcsardi just mentioned parallel:::isChild() in https://stat.ethz.ch/pipermail/r-devel/2020-January/078910.html and suggested having it exported from the 'parallel' package. This function lets you know whether an R process is a forked process or not:

> parallel:::isChild()
[1] FALSE

> f <- parallel::mcparallel(parallel:::isChild())
> parallel::mccollect(f)
[1] TRUE

> cl <- parallel::makeForkCluster(1L)
> parallel::clusterEvalQ(cl, { parallel:::isChild() })
[1] TRUE

So, a more generic approach to check if a future plan is set to use forked processing (via mc*...) or not is to launch a test future:

f <- future(parallel:::isChild())
is_forked <- value(f)

This should cover more cases out of the box, since the check runs on the worker itself rather than inspecting how the plan was set up.

Examples:

> library(future)
> f <- future(parallel:::isChild())
> value(f)
[1] FALSE

> plan(multisession, workers = 2L)
> f <- future(parallel:::isChild())
> value(f)
[1] FALSE

> plan(multicore, workers = 2L)
> f <- future(parallel:::isChild())
> value(f)
[1] TRUE

> cl <- parallel::makeForkCluster(1L)
> plan(cluster, workers = cl)
> f <- future(parallel:::isChild())
> value(f)
[1] TRUE
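The test-future approach shown in these examples could be wrapped in a small guard like the one below. This is a minimal sketch under stated assumptions: plan_is_forked() and check_no_forking() are hypothetical names, and parallel:::isChild() is an unexported function of the 'parallel' package, so it may change without notice.

```r
library(future)

## Hypothetical helper: TRUE if a future under the current plan
## is evaluated in a forked child process.
## Note: this launches a real (dummy) future on a worker.
plan_is_forked <- function() {
  f <- future(parallel:::isChild())
  isTRUE(value(f))
}

## Hypothetical guard: error early if the current plan forks
check_no_forking <- function() {
  if (plan_is_forked()) {
    stop("Forked parallel processing is not supported; ",
         "use plan(multisession) instead", call. = FALSE)
  }
  invisible(TRUE)
}
```

Because the check looks at where the future actually runs, it should only report TRUE when forking really happens. For example, where multicore falls back to sequential processing because forking is not supported, the test future runs in the main process and the guard passes.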

HenrikBengtsson · 2020-10-20T20:16:40Z

I've added futureverse/parallelly#18 for the possibility of having parallelly exporting isChild().

Closing this one.

HenrikBengtsson added this to the Future release (not next) milestone May 11, 2018

HenrikBengtsson added the feature request label May 11, 2018

HenrikBengtsson mentioned this issue Jan 7, 2020

ROBUSTNESS: Automatically disable multi-threading in forked processing ("multicore") #355

Closed

HenrikBengtsson added the "base R" label (Possibly something for base R itself) Jan 7, 2020

HenrikBengtsson mentioned this issue Jul 30, 2020

systemfonts::system_fonts crashes (segfault) when used in future r-lib/systemfonts#41

Open

HenrikBengtsson mentioned this issue Oct 20, 2020

Add isFork() futureverse/parallelly#18

Closed

HenrikBengtsson closed this as completed Oct 20, 2020
