plan(multicore) : <FutureError: Failed to retrieve the result of MulticoreFuture (<none>) from the forked worker (on localhost; PID 71). Post-mortem diagnostic: No process exists with this PID, i.e. the forked localhost worker is no longer alive.> #674
-
Hello. From the error:
it's very likely that the parallel worker process crashed completely. The next step is to figure out what in your R expression causes this to happen. I suspect this is not related to the future package per se. Note that 'multicore' uses forked processes for parallelization. Forked parallelization is risky business, and not all R code and underlying software support it. Sometimes it always fails; sometimes it's more subtle and only crashes "randomly". It could very well be that tensorflow, which appears to be involved here, is not fork-safe. If that's the case, you should be able to reproduce the problem using
My recommendation would be to reach out to the maintainers to ask whether tensorflow is fork-safe. Another option is to try with
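As a rough illustration of the fork-safety point, here is a minimal sketch that runs the same task once in forked workers and once in freshly started R sessions, using only the base `parallel` package. The function `risky_task()` is a hypothetical stand-in for the real work (for example a keras/tensorflow prediction) and is not part of the original report. The sketch assumes a Unix-like system, since forking is not available on Windows.

```r
library(parallel)

# Hypothetical stand-in for the real work (e.g. a keras/tensorflow call).
# Replace the body with the code you suspect is not fork-safe.
risky_task <- function(i) {
  sqrt(i)
}

# Forked workers: child processes are forks of the current R session.
# Unix-like systems only.
forked <- mclapply(1:4, risky_task, mc.cores = 2L)

# Fresh R sessions: each worker is a new R process, no forking involved.
cl <- makeCluster(2L)
fresh <- tryCatch(
  parLapply(cl, 1:4, risky_task),
  finally = stopCluster(cl)  # always shut the workers down
)

# For a fork-safe task, both approaches return the same results.
stopifnot(identical(forked, fresh))
```

If the forked run crashes or returns missing results while the fresh-session run succeeds, that points toward the task itself not being fork-safe rather than toward the parallel framework.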
-
Hi Henrik. I reached out to the tensorflow author and received a reply too. Here is the link for your reference. He mentioned that tensorflow is not fork-safe, either from R or from Python.
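Given that tensorflow is reported as not fork-safe, one option consistent with this thread is a non-forking backend such as `multisession`, combined with collecting each future's value explicitly so that a worker failure is reported rather than silently lost. The following is a minimal sketch only: `classify_text()` is a hypothetical placeholder for the real model call, and the worker count is an arbitrary assumption.

```r
library(future)

# Hypothetical placeholder for the real model call (not from the original post).
classify_text <- function(x) toupper(x)

# multisession launches separate background R sessions instead of forking.
plan(multisession, workers = 2L)

f <- future(classify_text("hello"), seed = TRUE)

# Retrieve the value explicitly; a crashed worker is signalled as a FutureError,
# which is caught here so the failure can be logged or reported.
result <- tryCatch(
  value(f),
  FutureError = function(e) {
    message("worker failed: ", conditionMessage(e))
    NULL
  }
)

plan(sequential)  # shut down the background workers
```

Keeping a reference to the future and calling `value()` on it is what makes a failed transaction visible to the calling code. Whether the model itself loads correctly in a fresh session is a separate question that this sketch does not address.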
-
Hi HenrikBengtsson
I am using future's "multicore" backend for my application's asynchronous calls. The application is a model that processes text and classifies it into different labels using the keras and tensorflow libraries with the help of the reticulate package. As the model runs on a Linux-based system, I am using the "multicore" backend. The model starts with plan(multicore) and ends with plan(sequential) to kill the forked processes.
This application basically does the modelling and posts the messages to Azure eventhub. When the application is triggered once at a time, no error is observed from the future. But when the application is triggered multiple times back to back, a few transactions succeed as expected, a few fail, and a few get lost. The transactions' failure/success messages are received on the eventhub side, and there is no trace of the lost messages there.
Describe the bug
Scenario 1: When multiple transactions are triggered back to back, the following logs are produced:
Eventhub message when the above future error occurs:
These errors do not correlate with my source code.
When the triggered transactions are lost in the future call, the eventhub does not receive any messages at all (they never reach the eventhub).
Scenario 2: When multisession is used:
The event message produced for the above trigger was:
Reproduce example
print("message to DB for reading data")
library(future)
future::plan(multicore, workers = availableCores())
print("printing availableCores")
print(availableCores())
print("ps cpu count")
print(ps::ps_cpu_count())
options(future.debug = TRUE)
options(future.globals.maxSize = 768*1024^2) ## maximum total size of exported globals (768 MiB); adjust to the input/output size
options(future.seed = TRUE)
print("future started - data read calling")
source("dbReadData.R", local = TRUE)
future(dataRead(lifecycle_name), lazy = FALSE, globals = TRUE)
future::plan(sequential) ## to kill the multicore forked processes
Expected behavior
When only one transaction is triggered at a time, the following logs are produced, as expected:
Eventhub messages in the success scenario:
Session information
R session info: