Epic: Enable overcommit #517
Comments
Here are some design ideas I've considered. I believe the third solution is probably the best one.
1. Implement exclusively within k8s scheduler plugin
Idea: Change the scheduler plugin so that it actually allows resource usage from
Good parts:
Bad parts:
2. Implement within neonvm-controller/neonvm-runner
Idea: When a VM is set to a certain amount of memory and/or CPU, we actually set QEMU to use some fixed multiple of that
Good parts:
Bad parts:
3. Implement via "overcommit" factor per VM
Idea: Add a new VM setting determining an "overcommit" factor that both the scheduler and cluster-autoscaler respect. Kind of a combination of ideas 1 and 2, but per-VM rather than global.
Good parts:
Bad parts:
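To make idea 3 concrete, here is a rough sketch of how a per-VM overcommit factor could feed into scheduler accounting. It is only an illustration, not the actual plugin code: the `VM` type, the `overcommit` field, and the rule "reserved = configured usage / factor" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical VM description; field names are assumptions for this sketch."""
    name: str
    cpu: float               # vCPUs the VM is configured to use
    mem_gib: float           # memory the VM is configured to use, in GiB
    overcommit: float = 1.0  # assumed per-VM setting; 1.0 means no overcommit


def reserved(vm: VM) -> tuple[float, float]:
    """Resources counted against the node for this VM.

    Assumption: the scheduler reserves the configured usage divided by
    the VM's overcommit factor.
    """
    if vm.overcommit < 1.0:
        raise ValueError("overcommit factor must be >= 1.0")
    return vm.cpu / vm.overcommit, vm.mem_gib / vm.overcommit


def fits(node_cpu: float, node_mem_gib: float, vms: list[VM]) -> bool:
    """Return True if the summed reservations of all VMs fit on the node."""
    total_cpu = sum(reserved(vm)[0] for vm in vms)
    total_mem = sum(reserved(vm)[1] for vm in vms)
    return total_cpu <= node_cpu and total_mem <= node_mem_gib
```

For example, three VMs of 4 vCPU / 16 GiB each do not fit on an 8 vCPU / 32 GiB node with the default factor of 1.0, but do fit if each is given a factor of 2.0. A cluster-autoscaler honoring the same factor would apply the same reduced reservations when deciding whether to add nodes.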
We discussed this today and decided to increase the overcommit factor gradually over the next few weeks, in increments of 0.1. We will watch for negative effects and will probably ramp up only to 1.5. To go beyond that, we will prioritize neondatabase/cloud#14114 as a prerequisite, to allow for faster reaction in case of node failures.
Made an initial implementation in #905; I still need to test it and self-review. Opened a handful of other PRs while I was poking around in the area:
Motivation
Look at the image:
Let's have at least 2x overcommit (because of the maximum number of pods that can be up and running)
DoD
Implementation ideas
TODO
Tasks
Other related tasks, Epics, and links