Future for R packages developers #456
First, note that the call oplan <- plan(multisession) adds overhead, because it launches the background R sessions. The more workers it launches, the larger that overhead. Other than that, "it should work". You can convince yourself that things are indeed running in parallel, with the number of workers you'd expect, by adding a bit of debug/verbose output, e.g.

my_fcn <- function(x, parallel = FALSE) {
  if (parallel) {
    oplan <- plan(multisession)
    on.exit(plan(oplan))  ## restore the previous plan on exit
    message("Number of workers: ", nbrOfWorkers())
    y <- future_lapply(x, FUN = function(z) {
      message("Worker PID: ", Sys.getpid())
      analyze(z)
    }) ## from future.apply package
  } else {
    y <- lapply(x, FUN = analyze)
  }
  summarize(y)
}

Call y <- my_fcn(x, parallel = TRUE) and you should see the debug output. You should expect one unique 'Worker PID' per worker, i.e. per core.
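If the start-up overhead of plan(multisession) turns out to matter, one option in line with the vignette's advice is to leave plan() out of the package function and let the user set it once. The following is only a minimal sketch under assumptions: analyze() and summarize() are hypothetical stand-ins for the real functions, and workers = 2 is an arbitrary choice for illustration.

```r
library(future)
library(future.apply)

# Hypothetical stand-ins for the analyze() and summarize() used above
analyze <- function(z) z^2
summarize <- function(y) sum(unlist(y))

# The package function does not call plan(); it runs on whatever
# backend the caller has selected (sequential if none was set)
my_fcn <- function(x) {
  y <- future_lapply(x, FUN = analyze)  ## from future.apply package
  summarize(y)
}

# The user selects the backend once, so the background R sessions
# are launched a single time rather than on every call
oplan <- plan(multisession, workers = 2)
res <- my_fcn(1:4)
plan(oplan)  # restore the previous plan
```

With this layout, repeated calls to my_fcn() reuse the same workers, and the user stays free to switch to another backend without any change to the package code.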
Dear Henrik,
I am developing an R package and I want to parallelize and/or distribute some processes within the main function using your "future" package.
As you recommend in a vignette (https://cran.r-project.org/web/packages/future/vignettes/future-7-for-package-developers.html), I should avoid calling plan() inside the main function, as in the following:
my_fcn <- function(x, parallel = FALSE) {
  if (parallel) {
    oplan <- plan(multisession)
    on.exit(plan(oplan))
    y <- future_lapply(x, FUN = analyze) ## from future.apply package
  } else {
    y <- lapply(x, FUN = analyze)
  }
  summarize(y)
}
I have a similar function in my package, which is internally called by the main function. When I run this function outside the main function it performs efficiently. However, if I include this function within the main function it takes too long (it seems that R is not using all the threads).
I would be very grateful if you could suggest a solution.
Erick.