How to have _exactly_ 2 R processes with multicore or multisession #631
-
I'm having trouble differentiating between what constitutes a 'worker' process vs the 'calling' R process when using multicore and/or multisession (on a UNIX-like system, in this case macOS). Here's some setup:
sleep <- Sys.sleep
pid <- Sys.getpid
now <- function() format(Sys.time(), "%H:%M:%OS6")
msg <- function(label) {
cat(sprintf("%s pid=%s now=%s\n", label, pid(), now()))
}
f <- function(label) {
msg(label)
sleep(2)
NULL
}
And now (using a multicore plan, though the results are effectively the same using multisession), here's a run with two workers:
plan(multicore, workers = 2)
f1 <- future(f("f1"))
f2 <- future(f("f2"))
msg("MAIN")
# MAIN pid=36186 now=21:27:39.603077
sleep(2)
value(f1)
# f1 pid=40137 now=21:27:39.586977
# NULL
value(f2)
# f2 pid=40138 now=21:27:39.606167
# NULL
So here there are 3 R processes (pids 36186, 40137, and 40138), with "MAIN" neither working on "f1" or "f2" nor being blocked itself. Using the same interpretation, I'd think that setting workers = 1 would give one worker process alongside the main one:
plan(multicore, workers = 1)
f1 <- future(f("f1"))
f2 <- future(f("f2"))
msg("MAIN")
# MAIN pid=36186 now=21:30:11.007350
sleep(2)
value(f1)
# f1 pid=36186 now=21:30:06.331819
# NULL
value(f2)
# f2 pid=36186 now=21:30:08.680732
# NULL
Now there's 1 R process, with all tasks running sequentially, i.e. zero additional processes were forked. How would one go about creating a plan where there's exactly one 'background' R process?
Replies: 1 comment 3 replies
-
Hi.
Regarding counting the main R session toward the number of workers or not, please see Issue #7 (WISH: It should be possible to adjust the number of assigned cores). As you see, this is an old topic, and in the early days, the main R session actually took up one "slot". This was then changed with the argument that it's confusing and that the main R session is most often idle or sleeping, e.g. future_lapply() launches futures or parallel workers and then polls for results once in a while, adding very little CPU load. This might of course not be true in all designs, but it's likely to be the most common one.
Next, when using workers = 1 with 'multisession' or 'multicore' the design is tha…
To launch a single background "multisession" worker, you can use:
plan(cluster, workers = 1)
This does not have the above …
For 'multicore', there's currently no way to run a single "multicore" worker. I'll think more about it, but one could imagine having a way to avoid the fallback to sequential processing by specifying something like:
plan(multicore, workers = I(1))
plan(multisession, workers = I(1))
The …
UPDATE 2022-07-24: future (>= 1.27.0) now supports …
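As a rough way to check the plan(cluster, workers = 1) suggestion, the sketch below compares the PID of the main session with the PIDs reported by two futures. The helper future_pids() is hypothetical and only for illustration. The second part assumes that the workers = I(1) form proposed above is what future (>= 1.27.0) supports; it is skipped on older versions.

```r
library(future)

# Hypothetical helper: run two futures under the current plan and
# return the PIDs of the processes that evaluated them.
future_pids <- function() {
  f1 <- future(Sys.getpid())
  f2 <- future(Sys.getpid())
  c(value(f1), value(f2))
}

main_pid <- Sys.getpid()

# One background worker via a single-node cluster.
plan(cluster, workers = 1)
cluster_pids <- future_pids()

# Assumption: workers = I(1) is the form supported by future (>= 1.27.0).
if (packageVersion("future") >= "1.27.0") {
  plan(multisession, workers = I(1))
  multisession_pids <- future_pids()
}

# Switch back to sequential processing, which shuts down the worker.
plan(sequential)

print(main_pid)
print(cluster_pids)
```

If this behaves as described, both entries of cluster_pids are the same PID and differ from main_pid, i.e. there are exactly two R processes: the main session and one background worker.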
Hi.
Regarding counting the main R session toward the number of workers or not, please see Issue #7 (WISH: It should be possible to adjust the number of assigned cores). As you see, this is an old topic, and in the early days, the main R session actually took up one "slot". This was then changed with the argument that it's confusing and that the main R session is most often idle or sleeping, e.g. future_lapply() launches futures or parallel workers and then polls for results once in a while, adding very little CPU load. This might of course not be true in all designs, but it's likely to be the most common one.
Next, when using workers = 1
with 'multisession' or 'multicore' the design is tha…