Make resource IO thread-safe #49

Open
schuderer opened this issue Oct 31, 2019 · 2 comments
Labels
enhancement New feature or request

Comments

@schuderer
Owner

schuderer commented Oct 31, 2019

Right now, the main use case is to train/test a model once, then run its prediction API (and maybe occasional re-testing). The heavy load is on the API, so the model is only ever read, never updated concurrently.

But once we enable on-line learning, every prediction API call could potentially create a new, updated model. One problem here is, of course, performance (so some sort of delayed serialization is probably in order). Another problem is concurrency between threads (let's just implement this for threads, not processes, in the beginning!). In theory, the model could be updated from different threads at the same time. We need to make sure this is possible WITHOUT blocking other API calls.

My suggestion is to implement a pattern similar to an IO queue monad. Calls to the resource module to write the new model (or to read the old one, for that matter) would return immediately, but would not necessarily be executed immediately. Instead, the data to update the model with would be put on a queue, and a separate thread would chug along, serializing whatever comes in. That thread could even throw away invalid updates, i.e. updates based on a model version that someone else has updated in the meantime. This is a design choice: alternatively, we can force every update to be applied in place, but then API calls would have to wait for other API calls to finish.
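To make the queue idea a bit more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under assumptions, not the resource module's actual API: `ModelStore`, `read`, `submit`, `flush`, `close` and the integer version counter are all hypothetical names, and the place where real serialization would happen is only marked with a comment. It takes the "throw away stale updates" option rather than the "force every update in place" option.

```python
import queue
import threading


class ModelStore:
    """Hypothetical stand-in for model IO in the resource module."""

    def __init__(self, model, version=0):
        self._lock = threading.Lock()  # guards only the quick reference swap
        self._model = model
        self._version = version
        self._updates = queue.Queue()
        self.discarded = 0  # number of stale updates thrown away
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def read(self):
        """Return (model, version) without waiting for queued writes."""
        with self._lock:
            return self._model, self._version

    def submit(self, new_model, based_on_version):
        """Enqueue an update and return immediately."""
        self._updates.put((new_model, based_on_version))

    def _drain(self):
        """Single writer thread: applies queued updates in FIFO order."""
        while True:
            item = self._updates.get()
            try:
                if item is None:  # sentinel pushed by close()
                    return
                new_model, based_on = item
                with self._lock:
                    if based_on != self._version:
                        # Someone else updated the model first: drop this one.
                        self.discarded += 1
                        continue
                    self._model = new_model
                    self._version += 1
                # Assumption: slow serialization (e.g. writing to disk)
                # would go here, outside the lock, so readers never wait on it.
            finally:
                self._updates.task_done()

    def flush(self):
        """Block until every queued update has been processed."""
        self._updates.join()

    def close(self):
        """Stop the writer thread after the queue has drained."""
        self._updates.put(None)
        self._writer.join()
```

An API call would do `model, version = store.read()`, compute its updated model, and hand it over with `store.submit(updated, version)`. Because a single thread applies the updates, two calls that started from the same version cannot both win: the second is counted in `discarded`. The lock is held only for the reference swap, so reads stay cheap even while updates are pending.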

Good luck. :)

@schuderer
Owner Author

The same might be relevant for (some types of) data sinks.

@schuderer
Owner Author

Related to #32

@schuderer schuderer added this to To do in Prioritized User Issues via automation May 17, 2021