Thoughts on m:n worker mapping? #474
Hey! To make sure I understand what you are suggesting: the idea is to have a Piscina instance that spawns threads, each of which spawns another Piscina instance with more threads to handle the load. Theoretically this should not be an issue; worst case, the threads will steal CPU time from each other, and if the pool sits behind a synchronous service (e.g. an HTTP server) that contention can also hurt you, but I wouldn't take that for granted unless you've tested it yourself. To do such coordination, and maybe even try to implement work stealing, you can take a look at the following example of message communication between the main thread and a child thread. I imagine the core of the work will live in the coordination of those Piscina instances, so that might be enough (?).
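As a rough sketch of what that parent-side coordination could look like (the `PoolCoordinator` name and structure are my own, not part of Piscina's API): the parent tracks in-flight counts per nested pool and dispatches each task to the least-loaded one. Each entry only needs a `run(task)` method, which matches the shape of a Piscina instance.

```javascript
// Sketch of coordinating several nested pools from the parent thread.
// In practice `pools` would be Piscina instances; here each entry only
// needs a `run(task)` method, so the picker itself is pool-agnostic.
class PoolCoordinator {
  constructor(pools) {
    this.pools = pools.map((pool) => ({ pool, inFlight: 0 }));
  }

  // Pick the pool with the fewest in-flight tasks (least-used).
  pickLeastLoaded() {
    return this.pools.reduce((best, entry) =>
      entry.inFlight < best.inFlight ? entry : best
    );
  }

  async run(task) {
    const entry = this.pickLeastLoaded();
    entry.inFlight++;
    try {
      return await entry.pool.run(task);
    } finally {
      entry.inFlight--;
    }
  }
}
```

The in-flight counter is the simplest possible load signal; a real implementation would probably want queue depth or task cost instead.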
Yeah, that would be amazing
Let's say I start with piscina at the top level instead of the cluster module, so now all of my Koa/Express/Nest instances are running as workers. And let's say I want to occasionally compile a stylesheet or rebuild a template in those, but I want to cap that at, say, 5 concurrent tasks so that one user can't monopolize the server. And then someone else likes this idea so much they do the same with another subsystem or two. I don't need to just send a task to an arbitrary worker; I need some of the fairness/load balancing that piscina and some other libraries try to provide. The closest I could get now is to run a few services in the same docker container. Otherwise I'm converting them into full services and dealing with deployment issues and inter-version compatibility surface area. Or just putting up with the unfair system we already have, because this is too much work to sell people on (people possibly being 'me').
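The "cap at 5 concurrent tasks" part can be sketched independently of any load-balancing changes, with a plain semaphore wrapped around the pool call. This is a generic pattern, not a Piscina feature; `pool.run(task)` in the comment is the assumed call shape.

```javascript
// Sketch: cap one category of work (e.g. stylesheet builds) at `limit`
// concurrent tasks, regardless of how many pool workers exist.
class Semaphore {
  constructor(limit) {
    this.limit = limit;
    this.active = 0;
    this.waiters = [];
  }

  async acquire() {
    if (this.active < this.limit) {
      this.active++;
      return;
    }
    // Wait; release() hands the freed slot straight to us, so we do
    // not touch `active` here (avoids a handoff race with new callers).
    await new Promise((resolve) => this.waiters.push(resolve));
  }

  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the slot directly to the next waiter
    } else {
      this.active--;
    }
  }

  // Run fn under the limit; fn would typically be () => pool.run(task).
  async withLimit(fn) {
    await this.acquire();
    try {
      return await fn();
    } finally {
      this.release();
    }
  }
}
```

Usage would look like `stylesheetLimit.withLimit(() => pool.run({ file }))`, so any number of requests can ask for a rebuild but only five run at once.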
Hmm, that's interesting. Currently, piscina does not offer stickiness unless you set it up manually by providing your own task scheduling algorithm (which can be done as shown here). Here I imagine a weighted or least-used distribution; Piscina does not offer that out of the box, but it can be implemented manually using a custom queue, although the queue currently lacks the worker info needed to understand each worker's load. Based on that, maybe we can start by allowing Piscina to expose that worker info to the task queue, and that might help you with the implementation. But overall, it should not be a problem to have a thread cluster built from Piscina instances inside threads.
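For reference, a custom queue sketch of the kind mentioned above. Piscina's documented custom task-queue contract (at the time of writing) is `push(task)`, `shift()`, `remove(task)`, and a `size` getter; the `task.options.priority` shape below is an assumption about how submit options surface on the queued task, so check the docs for the version you use.

```javascript
// Sketch of a custom queue for Piscina's `taskQueue` option:
// a priority queue where higher `priority` runs first.
class PriorityTaskQueue {
  constructor() {
    this.tasks = [];
  }

  get size() {
    return this.tasks.length;
  }

  push(task) {
    // Assumption: the queued task carries the submit options; verify
    // the actual task shape against Piscina's documentation.
    const priority = (task.options && task.options.priority) || 0;
    const index = this.tasks.findIndex(
      (t) => ((t.options && t.options.priority) || 0) < priority
    );
    if (index === -1) this.tasks.push(task);
    else this.tasks.splice(index, 0, task);
  }

  shift() {
    return this.tasks.shift() || null;
  }

  remove(task) {
    const index = this.tasks.indexOf(task);
    if (index !== -1) this.tasks.splice(index, 1);
  }
}
```

This is exactly where exposing worker load info to the queue would help: `shift()` has no way to know which worker is asking, so it cannot make per-worker decisions yet.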
The cluster module is fairly averse to the idea of swapping out the LB algorithm. Having that in Piscina would be another 'pro' for switching, for at least some of us. Even ignoring the bigger picture here, there would be a benefit to supporting least-conn and probably random-2 in Piscina. That might speed up a couple of batch-processing cases I have.
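For anyone unfamiliar with "random-2": it is the power-of-two-choices strategy, which samples two workers at random and routes to the less loaded one, getting most of the benefit of least-conn without scanning every worker. A minimal sketch (the `getLoad` accessor and injectable `rng` are my own, for testability):

```javascript
// "Random-2" / power-of-two-choices picker: sample two distinct
// workers at random and return the less loaded one. `rng` defaults to
// Math.random but is injectable so the choice is testable.
function pickRandomTwo(workers, getLoad, rng = Math.random) {
  if (workers.length === 1) return workers[0];
  const i = Math.floor(rng() * workers.length);
  let j = Math.floor(rng() * (workers.length - 1));
  if (j >= i) j++; // shift to guarantee two distinct indices
  const a = workers[i];
  const b = workers[j];
  return getLoad(a) <= getLoad(b) ? a : b;
}
```

Compared with a full least-conn scan this is O(1) per dispatch, which matters once pools get large or dispatch is on a hot path.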
Yeah, I'll see if I have any bandwidth in the upcoming weeks to kick something off. PRs are welcome if you are interested 🙂
I poked at the code, and it bears a passing resemblance to a problem I'm solving at work: a memory leak in a caching system that is not handling publish-triggered evictions properly. I need to keep a class of data together so I can drop it as a single action instead of scanning for candidates.

Essentially, you are treating all of the workers as one shared pool and then scanning them for availability. I think it might do as a first step to segregate the workers into a busy (saturated) list and an available list, because a lot of code in the distribution system concerns itself specifically with that case. Then we can talk about how one might use a different mechanism to pick among a list of slightly idle workers. For instance, with a concurrency limit of 3, how would you maintain a fair distribution of work (not tasks) across the workers to improve response time, particularly with tasks of variable cost?

Another solution is to distribute all messages at receipt, but that would necessitate implementing work stealing. Since any mature distributed computing implementation ends up with work stealing anyway, that may not be so bad. That said, all of this may be at odds with my original request, depending on implementation.
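The busy/available split described above can be sketched as two sets plus a load map; everything here (`WorkerLists`, `capacity`) is hypothetical naming, not Piscina internals. A worker moves to the busy set once it hits `capacity` in-flight tasks and moves back when a task completes, so the scheduler only ever scans not-saturated workers:

```javascript
// Sketch of segregating workers into busy (saturated) and available
// lists, so availability checks never scan saturated workers.
class WorkerLists {
  constructor(workers, capacity) {
    this.capacity = capacity;
    this.available = new Set(workers);
    this.busy = new Set();
    this.load = new Map(workers.map((w) => [w, 0]));
  }

  assign(worker) {
    const n = this.load.get(worker) + 1;
    this.load.set(worker, n);
    if (n >= this.capacity) {
      this.available.delete(worker);
      this.busy.add(worker);
    }
  }

  complete(worker) {
    const n = this.load.get(worker) - 1;
    this.load.set(worker, n);
    if (n < this.capacity && this.busy.has(worker)) {
      this.busy.delete(worker);
      this.available.add(worker);
    }
  }

  // Any not-saturated worker; a fairer picker (least-used, random-2)
  // would slot in here, operating only on the available set.
  next() {
    return this.available.values().next().value || null;
  }
}
```

The point of the split is that `next()` never has to consider the saturated case at all; whatever picker replaces it inherits that property.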
Yeah, the default balancing is basic: a plain FIFO that hands each task to the first available worker. My idea is the following:
This issue has been marked as stale because it has been opened 30 days without activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity. |
We have an app that's a bit of a mess. Each request can get sucked into one of a handful of complex tasks which require CPU and local copies of shared resources. We end up running 1 process per core to improve fairness, but it seems like a titanic waste of resources, having so many copies of our code and static files loaded just to avoid CPU stalls, and every process having to arrive at the same lazily cached data set independent of each other. And even with all this we still have to add throttling code in spots to keep from creating Thundering Herd situations with other services.
I keep coming back to wanting something a bit more Erlang-like, where we segregate the heaviest lifting into several distinct processes that are called by the routing code, or by each other, so that we have around six copies of each per server. Same number of processes, fewer caches, better peak-traffic control.
I'm loath to use sidecars for this because of the deployment nightmare they entail around change management. I think I'd be much better off with shared workers. And while I believe I know how to cheese such a thing within Piscina, by stealing and redistributing communication channels between processes, I would lose most of the API surface area in the process.
I'm wondering how thin a wrapper one could make, or how much code it would take, to get piscina to handle this better. It feels like a solution might be extra behavior when Piscina is itself instantiated inside a worker that Piscina started: it could negotiate with the parent process to avoid duplicate worker pools.