Add "-m <N>" option to dynamically limit jobs #1354
Last-second manual fix.
Comedy of errors.
You can refer to a previous pull request (#660) that implemented this feature, and to the discussion it received back then.
This comment was marked as abuse.
@maxim-kuvyrkov
This would be a very useful feature to have. What will it take to get this merged?
Please correct me if I'm wrong, but this pull request does not currently contain any functional code. Is the code located on the limit-on-ram branch instead?
Hi @skardach , I've found that for builds that can, potentially, exhaust RAM, it's much more effective to
The above approach is implemented here: https://github.com/maxim-kuvyrkov/ninja/tree/limit-on-cpu . However, it depends on the CONFIG_HZ kernel setting, for which I couldn't find a public API. Therefore I don't see how to make this approach generic enough to be included in upstream ninja.
I'd like to note that our team has a use case where
@tmilev, your use case can potentially be addressed with pools: https://ninja-build.org/manual.html#ref_pool
@maxim-kuvyrkov Since it looks like you are already relying on the cgroup2 filesystem (/sys/fs/cgroup), perhaps you could make use of the PSI monitor files {io,cpu,memory}.pressure (https://docs.kernel.org/accounting/psi.html) rather than cpuacct.stat/CONFIG_HZ? These read directly as your desired "ratio of time spent waiting for X", in addition to the absolute time spent waiting, which should get you away from needing CONFIG_HZ. Pressure Stall Information is a cgroup2 feature (kernel 4.20 and up), and it is easily detectable by whether the *.pressure files exist.

Or, for the extra-credit solution, you can even set up your own triggers that can be monitored via select()/poll() (https://docs.kernel.org/accounting/psi.html#monitoring-for-pressure-thresholds). You could plumb these into SubprocessSet::DoWork to integrate something similar to https://github.com/tobixen/thrash-protect : use SIGSTOP to temporarily pause jobs you've already launched, letting ninja recover gracefully if it turns out that it has already launched too many new jobs that all turned out to be large and pushed the cgroup into thrashing. Then use SIGCONT to resume a paused job (instead of launching a new one) as non-paused jobs finish and the pressure eases.
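For illustration only, here is a minimal Python sketch of reading one of those *.pressure files. It is not code from this PR or from ninja. The function names `parse_psi` and `read_pressure` and the default `cgroup_dir` are assumptions made for the example; the line format (`some`/`full` followed by `avg10=`, `avg60=`, `avg300=`, `total=`) follows the kernel PSI documentation linked above.

```python
from pathlib import Path


def parse_psi(text):
    """Parse PSI text into a dict such as {"some": {"avg10": 0.0, "total": 0}}.

    The avg* fields are percentages of wall time spent stalled over the last
    10/60/300 seconds; total is cumulative stall time in microseconds.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        stats = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            stats[key] = int(value) if key == "total" else float(value)
        result[kind] = stats
    return result


def read_pressure(resource, cgroup_dir="/sys/fs/cgroup"):
    """Return parsed PSI data for "cpu", "memory" or "io", or None if unavailable.

    cgroup_dir is a hypothetical default; a real caller would pass the
    directory of the cgroup it actually runs in.
    """
    path = Path(cgroup_dir) / f"{resource}.pressure"
    try:
        return parse_psi(path.read_text())
    except OSError:
        # File missing or PSI disabled: the caller falls back to a fixed job limit.
        return None
```

A caller could treat a `None` result as "PSI not supported here", which matches the detection-by-file-existence idea in the comment above.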
Hi @puetzk, thanks for the suggestion! I have implemented another approach (see [1] and [2]), which allows several containers to compete gracefully for limited RAM. Your suggestion applies equally well to that new approach. The idea in [1] and [2] is that we increase parallelism until we run into CPU waiting. The CPU waiting can come from a direct CPU share limit (our container / cgroup has used up its fair share of CPU cycles) or from the RAM allowance being exhausted, at which point we have started to use swap. While swapping, we are "waiting" for CPU just as we are when hitting the CPU share limit. This new approach allows ninja to dance on the edge of swapping when there is high demand for RAM from other containers, while increasing parallelism to the maximum when other containers aren't using RAM. [1] maxim-kuvyrkov@70ef9be .
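The "increase parallelism until we run into CPU waiting" idea can be sketched as a simple feedback step. This is a hedged illustration in Python, not the code from the referenced commit: the function name `next_job_limit`, the 10% threshold and the step size of one job are all assumptions chosen for the example.

```python
def next_job_limit(current, cpu_wait_pct, max_jobs, threshold_pct=10.0):
    """Return the job limit to use for the next sampling interval.

    current       -- job limit used during the last interval
    cpu_wait_pct  -- share of time jobs spent waiting for CPU (0 to 100)
    max_jobs      -- upper bound, e.g. the value passed with -j
    threshold_pct -- hypothetical cut-off above which we back off
    """
    if cpu_wait_pct > threshold_pct:
        # Jobs are stalling (CPU share exhausted or swapping): back off,
        # but always keep at least one job running.
        return max(1, current - 1)
    # No significant waiting: probe for more parallelism, up to the cap.
    return min(max_jobs, current + 1)
```

For example, with `max_jobs=8` a limit of 4 would grow to 5 while there is no waiting, and shrink to 3 once the measured waiting exceeds the threshold.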
Dropping this pull request in favour of https://github.com/maxim-kuvyrkov/ninja/commits/limit-on-cpu .
Yeah, my suggestion related more to your limit-on-cpu branch, but this was the thread where it was being discussed.
That seemed like the solution to your comment about
The implementation in #2300 should work for non-containerized environments as well.
... on memory threshold. So far, only Unix-style OSes are supported.
No functional change for other configurations.