
Thoughts on m:n worker mapping? #474

Closed
jdmarshall opened this issue Dec 30, 2023 · 9 comments

@jdmarshall commented Dec 30, 2023

We have an app that's a bit of a mess. Each request can get sucked into one of a handful of complex tasks which require CPU and local copies of shared resources. We end up running 1 process per core to improve fairness, but it seems like a titanic waste of resources, having so many copies of our code and static files loaded just to avoid CPU stalls, with every process having to arrive at the same lazily cached data set independently of the others. And even with all this we still have to add throttling code in spots to keep from creating Thundering Herd situations with other services.

I keep coming back to wanting something a bit more Erlang like, where we segregate the heaviest lifting into several distinct processes that are called by the routing code, or by each other, so that we have around 6 copies of each per server. Same number of processes, fewer caches, better peak traffic control.

I'm loath to use sidecars for this because of the deployment nightmare it entails around change management. I think I'd be much better off with shared workers. And while I believe I know how to cheese such a thing within Piscina, by stealing and redistributing communication channels between processes, I would lose most of the API surface area in the process.

I'm wondering how thin a wrapper one could make, or how much code it would take, to get piscina to handle this better. It feels like one solution might be extra behavior when Piscina is itself instantiated from a worker that Piscina started: the nested instance could negotiate with the parent process to avoid duplicate worker pools.

@metcoder95 (Member)

Hey!

To make sure I understand what you are suggesting: the idea is to have a Piscina instance that spawns threads, and each of those threads also spawns its own Piscina instance with more threads to handle the load accordingly.

Theoretically, this should not be an issue; in the worst-case scenario you'll have threads stealing CPU time from each other, and if this sits behind a synchronous service (e.g. an HTTP server) that can also hurt you, but I wouldn't take it for granted unless you have already tested it yourself.

To do such coordination, and maybe even to attempt work stealing, you can take a look at the following example of message communication between the main and child threads.

I imagine the core of the work will live in the coordination of those Piscina instances, so that might be enough(?).
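To make that layout concrete, here is a minimal sketch of a pool whose workers each own a nested pool, with a MessagePort handed down for coordination. The file names (router-worker.js, heavy-task.js) and the six-thread cap are assumptions for illustration, not an official pattern.

```js
// main.js — top-level pool; each dispatched task gets a port for coordination.
const Piscina = require('piscina');
const { MessageChannel } = require('worker_threads');
const path = require('path');

const routers = new Piscina({ filename: path.join(__dirname, 'router-worker.js') });

const { port1, port2 } = new MessageChannel();
port1.on('message', (msg) => console.log('from router worker:', msg));

// Transfer one end of the channel into the worker alongside the task data.
routers.run({ port: port2 }, { transferList: [port2] }).catch(console.error);
```

```js
// router-worker.js — runs inside a worker thread and spawns its own nested pool.
const Piscina = require('piscina');
const path = require('path');

const heavy = new Piscina({
  filename: path.join(__dirname, 'heavy-task.js'),
  maxThreads: 6 // roughly the "6 copies per server" mentioned above
});

module.exports = async ({ port }) => {
  port.postMessage('router pool ready'); // report back to the parent process
  return heavy.run({ some: 'payload' });
};
```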

> I keep coming back to wanting something a bit more Erlang like, where we segregate the heaviest lifting into several distinct processes that are called by the routing code, or by each other, so that we have around 6 copies of each per server. Same number of processes, fewer caches, better peak traffic control.

Yeah, that would be amazing

@jdmarshall (Author)

Let's say I start with piscina at the top level instead of the cluster module, so now all of my Koa/Express/Nest instances are running as workers. And let's say I want to occasionally compile a stylesheet or rebuild a template in those, but I want to bottleneck that to, say, 5 tasks so that one user can't monopolize the server.

And then someone else likes this idea so much they do the same with another subsystem or 2.

I don't need to just send a task to an arbitrary worker, I need some of the fairness/load balancing that piscina and some other libraries try to provide.

The closest I could get now is to run a few services in the same docker container. Otherwise I'm converting them to full services and dealing with deployment issues and inter-version compatibility surface area. Or just putting up with the unfair system we already have because this is too much work to sell people on (people possibly being 'me')
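To make the stylesheet bottleneck above concrete: one way to cap a single heavy task type today is a small dedicated pool; a rough sketch, with compile-stylesheet.js as a hypothetical worker file:

```js
// styles-pool.js — a small shared pool that caps stylesheet compilation.
const Piscina = require('piscina');
const path = require('path');

const stylesPool = new Piscina({
  filename: path.join(__dirname, 'compile-stylesheet.js'),
  maxThreads: 5, // at most 5 compilations run concurrently
  maxQueue: 100  // optionally bound the backlog; run() rejects beyond this
});

// Called from the routing code: excess requests wait in the queue rather than
// letting one user monopolize the server.
function compileStylesheet(source) {
  return stylesPool.run({ source });
}

module.exports = { compileStylesheet };
```

This still doesn't provide the fairness across task types and subsystems that the rest of this comment is asking for, though.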

@metcoder95 (Member)

Hmm, that's interesting.

Currently, piscina does not offer stickiness unless you set it up manually by providing your own task scheduling algorithm (which can be done as shown here).

The default load balancing from Piscina is a simple FIFO, but we have seen requests to offer more algorithms out-of-the-box.

Here I imagine a weighted or least-used distribution; currently, Piscina does not offer that, but it can be implemented manually using a custom queue, although the queue misses the worker info needed to understand each worker's load.

Based on that, maybe we can start by allowing Piscina to offer that worker info to the Task Queue, and that might help you with the implementation.

But overall, it should not be a problem to have a thread cluster with Piscina instances spawned from threads.
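For reference, the custom-queue hook mentioned above takes roughly this shape; note that the queue only sees tasks, not per-worker load, which is exactly the gap being discussed (worker.js is a hypothetical file name):

```js
// A queue Piscina can drive via the `taskQueue` option: it needs a size
// getter plus push(task), remove(task) and shift().
const Piscina = require('piscina');
const path = require('path');

class InstrumentedQueue {
  constructor() { this.tasks = []; }

  get size() { return this.tasks.length; }

  // Called when no worker is free; `task` is Piscina's internal task object.
  push(task) { this.tasks.push(task); }

  remove(task) {
    const i = this.tasks.indexOf(task);
    if (i !== -1) this.tasks.splice(i, 1);
  }

  // Called when a worker frees up; returns the next task to dispatch.
  shift() { return this.tasks.shift() ?? null; }
}

const pool = new Piscina({
  filename: path.join(__dirname, 'worker.js'), // hypothetical worker file
  taskQueue: new InstrumentedQueue()
});
```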

@jdmarshall (Author) commented Jan 8, 2024

The cluster module is fairly averse to the idea of swapping the LB algorithm. Having that in piscina would be another 'pro' for switching, for at least some of us.

Even ignoring the bigger picture here, there would be benefit from supporting least-conn and probably random-2 for piscina. Might speed up a couple of batch processing cases that I have.
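For what it's worth, the selection rule behind random-2 (power of two choices) is tiny; the sketch below assumes a hypothetical array of per-worker in-flight counts, which Piscina does not expose today:

```js
// "Random-2": sample two workers at random and give the task to the less
// loaded one. `inFlight` is a hypothetical per-worker in-flight count array.
function pickWorkerRandom2(inFlight) {
  const a = Math.floor(Math.random() * inFlight.length);
  const b = Math.floor(Math.random() * inFlight.length);
  return inFlight[a] <= inFlight[b] ? a : b;
}
```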

@metcoder95 (Member)

Yeah, I'll see if I have any bandwidth in the upcoming weeks to kick something off; PRs are welcome if you are interested 🙂

@jdmarshall (Author) commented Jan 10, 2024

I poked at the code, and it bears a passing resemblance to a problem I'm solving at work: a memory leak in a caching system that is not handling publish-triggered evictions properly. I need to keep a class of data together so I can drop it as a single action instead of scanning for candidates.

Essentially you are treating all of the workers as part of a shared state and then scanning them for availability.

I think it might do as a first step to segregate the workers into a busy (saturated) list and an available list, because there's a lot of code in the distribution system that concerns itself specifically with that case.

Then we can talk about how one might use a different mechanism to pick among the list of slightly idle workers. For instance, with a limit of 3, how would you maintain a fair distribution of work (not tasks) across the workers to improve response time, particularly with tasks of variable size?
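As a purely illustrative sketch of that split (not Piscina internals), something like this would let any dispatch strategy scan only the workers that can still take a task:

```js
// Keep saturated workers in a separate set so dispatch never scans them.
class WorkerLists {
  constructor() {
    this.available = new Set();
    this.busy = new Set();
  }

  markBusy(worker) { this.available.delete(worker); this.busy.add(worker); }

  markAvailable(worker) { this.busy.delete(worker); this.available.add(worker); }

  // Any strategy (FIFO, least-busy, random-2) only has to look at `available`.
  pickAny() { return this.available.values().next().value ?? null; }
}
```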

Another solution is to distribute all messages at receipt, but that would necessitate implementing work stealing. However, since any mature distributed-computing implementation ends up with work stealing, that may not be so bad.

That said, all of this may be at odds with my original request, depending on implementation.

@metcoder95 (Member)

Yeah, the default balancing is basic: a normal FIFO until you find an available worker.
We also have support for custom queues, but the queue lacks the overall information about the set of workers needed to decide which of them should receive the load.

My idea is the following:

  1. We expose more internal data to the Custom Queues.
  2. We create a bare-bones least-busy strategy that can be used out of the box (a rough sketch follows).
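Assuming step 1 hands the queue something like a per-worker pending-task count (not a real Piscina API today; the names here are hypothetical), step 2 could be as small as:

```js
// Hypothetical least-busy pick over a `workers` array that exposes a
// pendingTasks count per worker.
function leastBusy(workers) {
  let best = null;
  for (const w of workers) {
    if (best === null || w.pendingTasks < best.pendingTasks) best = w;
  }
  return best;
}
```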


github-actions bot commented May 5, 2024

This issue has been marked as stale because it has been opened 30 days without activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label May 5, 2024

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as not planned May 19, 2024