
Persistent workers for the *apply backends #289

Closed

wlandau opened this issue Feb 28, 2018 · 8 comments

wlandau commented Feb 28, 2018

I tried this once before and failed, but from what I learned in #227, I think it is possible after all. The master process can communicate with the workers through a special "workers" namespace in the cache. If we succeed, we may not need a callr backend (#278), though we should keep the existing scheduler for the future backend for cases where the workers do not have cache access.
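
For illustration, a minimal sketch of that namespace idea with storr (the "workers" and "results" namespaces, the key scheme, and the job format here are just assumptions for the example):

```r
library(storr)

# A shared on-disk cache. A dedicated "workers" namespace keeps
# messages separate from the cached targets themselves.
cache <- storr_rds(tempfile())

# Master: post a job description for worker 1.
cache$set(
  "worker_1",
  list(target = "foo", command = quote(sqrt(2))),
  namespace = "workers"
)

# Worker 1: poll for a job, run it, store the result, clear the slot.
if (cache$exists("worker_1", namespace = "workers")) {
  job <- cache$get("worker_1", namespace = "workers")
  cache$set(job$target, eval(job$command), namespace = "results")
  cache$del("worker_1", namespace = "workers")
}

cache$get("foo", namespace = "results") # 1.414214
```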


wlandau commented Feb 28, 2018

Best part: we could totally get rid of staged parallelism this way and be left with a minimal-overhead solution.


krlmlr commented Feb 28, 2018

I thought a bit about message passing. We don't need very elaborate functionality, just post, receive, and wait for R objects. On the other hand, a storr namespace will require file system access for all workers. I wonder if we could use an established library like MPI for sending commands to workers and receiving results.
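
To fix ideas, here is a toy file-based version of those three primitives (the function names and the inbox layout are hypothetical; a real implementation could back the same interface with storr or MPI):

```r
# Toy message passing over the file system: an "inbox" is a directory,
# and each message is one serialized .rds file.
post <- function(inbox, msg) {
  # Write atomically: save to a temp name, then rename into place.
  tmp <- tempfile(tmpdir = inbox)
  saveRDS(msg, tmp)
  file.rename(tmp, paste0(tmp, ".rds"))
}

receive <- function(inbox) {
  # Take the oldest message, or return NULL if the inbox is empty.
  files <- list.files(inbox, pattern = "\\.rds$", full.names = TRUE)
  if (!length(files)) return(NULL)
  oldest <- files[which.min(file.mtime(files))]
  msg <- readRDS(oldest)
  unlink(oldest)
  msg
}

wait <- function(inbox, interval = 0.1) {
  # Block (by polling) until a message arrives.
  repeat {
    msg <- receive(inbox)
    if (!is.null(msg)) return(msg)
    Sys.sleep(interval)
  }
}

# Usage:
# inbox <- tempfile(); dir.create(inbox)
# post(inbox, list(target = "foo", command = quote(sqrt(2))))
# wait(inbox)$command
```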


wlandau commented Feb 28, 2018

Message passing is certainly the appropriate paradigm here. Would posting allow us to send entire targets to the master? Otherwise, I think the workers need cache access anyway.

wlandau removed their assignment Feb 28, 2018

wlandau commented Feb 28, 2018

All the *apply backends already assume cache access, so that is something else we may need to fix.


krlmlr commented Feb 28, 2018

Even if the worker reads a file created by the master or by some other worker, it can be viewed as a form of "posting" a message. A message can be a blob of arbitrary size. We just don't want to block the master until a worker has read the data, which is why I used this term.

  1. The master posts a job description (command + inputs) and assigns it to a worker.
  2. A worker receives the data, does its job and posts the reply back to the master.
  3. The master receives the reply and posts a new job.
  4. Occasionally, the worker or the master needs to wait for new data to arrive.

MPI should be able to handle this, though I wonder whether it is the best solution.
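
A rough sketch of this loop with Rmpi (untested; it assumes a working MPI installation, and the message tags and job format are invented for the example):

```r
library(Rmpi)

# Spawn two persistent workers.
mpi.spawn.Rslaves(nslaves = 2)

# Each worker loops: receive a job from the master (rank 0),
# evaluate it, and post the reply back. A NULL job means "shut down".
mpi.bcast.cmd({
  repeat {
    job <- mpi.recv.Robj(source = 0, tag = 1)
    if (is.null(job)) break
    mpi.send.Robj(list(target = job$target, value = eval(job$command)),
                  dest = 0, tag = 2)
  }
})

# Master: post a job description (command + inputs) to worker 1 ...
mpi.send.Robj(list(target = "foo", command = quote(sqrt(2))),
              dest = 1, tag = 1)

# ... receive the reply, then post the next job, and so on.
reply <- mpi.recv.Robj(source = mpi.any.source(), tag = 2)

# Shut down.
for (w in 1:2) mpi.send.Robj(NULL, dest = w, tag = 1)
mpi.close.Rslaves()
```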


wlandau commented Feb 28, 2018

  1. The master receives the reply...

Is the master receiving the value of the target itself?

  MPI should be able to handle this, though I wonder whether it is the best solution.

Yeah, it seems like Rmpi might be its own separate backend if we go that direction. I am not sure how mclapply() workers, for example, would be able to take advantage of MPI-style message passing.


wlandau commented Feb 28, 2018

Another thing: before each target is built, the environment should be pruned to make sure the target's dependencies are loaded and targets we no longer need are unloaded. To know what we can safely unload, each worker needs to know which targets the other workers are building. This information is easy to communicate through the file system, and it should also be possible with message passing.
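
A sketch of that bookkeeping on top of a shared storr cache (the "in_progress" namespace and the helper names are hypothetical):

```r
library(storr)
cache <- storr_rds(tempfile())

# Before building, each worker advertises its target and that
# target's dependencies.
announce <- function(cache, worker, target, deps) {
  cache$set(worker, list(target = target, deps = deps),
            namespace = "in_progress")
}

# A loaded target is safe to unload only if no in-progress job,
# on any worker, still needs it.
safe_to_unload <- function(cache, loaded) {
  workers <- cache$list(namespace = "in_progress")
  jobs <- cache$mget(workers, namespace = "in_progress")
  keep <- unique(unlist(lapply(jobs, function(j) c(j$target, j$deps))))
  setdiff(loaded, keep)
}

announce(cache, "worker_1", "model", deps = c("data", "params"))
safe_to_unload(cache, loaded = c("data", "params", "old_target"))
#> [1] "old_target"
```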


wlandau commented Feb 28, 2018

Closing because I think we should move this thread to #285. Persistence is a whole new scheduling paradigm for drake, and I think it is an excellent opportunity to begin a separate scheduling package.

wlandau closed this as completed Feb 28, 2018
wlandau mentioned this issue Feb 28, 2018