
Do not consider preemption.borrowWithinCohort in FairSharing preemptions #4165

Open · gabesaba wants to merge 2 commits into main from fair_sharing_preemption_loop
Conversation

@gabesaba (Contributor) commented Feb 6, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

preemption.borrowWithinCohort is being used to override dominantResourceShare, causing preemptions that do not match the scheduling order and resulting in the bug described here. We do not document preemption.borrowWithinCohort as applying to FairSharing, only to the other two preemption policies - link.
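
To illustrate, here is a sketch of the combination in question; the names and values (team-a, org, the threshold) are examples, not taken from the linked issue:

```yaml
# ClusterQueue opting into borrowWithinCohort preemption (example values).
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a
spec:
  cohort: org
  preemption:
    reclaimWithinCohort: Any
    borrowWithinCohort:         # ignored by FairSharing after this PR
      policy: LowerPriority
      maxPriorityThreshold: 100
---
# Kueue Configuration fragment enabling FairSharing.
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
fairSharing:
  enable: true
```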

Which issue(s) this PR fixes:

Fixes #3779

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Update FairSharing to be incompatible with ClusterQueue.Preemption.BorrowWithinCohort. Using these parameters together is a no-op, and will be validated against in future releases. This change fixes an edge case which triggered an infinite preemption loop when these two parameters were combined.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 6, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 6, 2025

netlify bot commented Feb 6, 2025

Deploy Preview for kubernetes-sigs-kueue canceled.

🔨 Latest commit: 8390d77
🔍 Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/67b48286196fd20008d961ba

@tenzen-y (Member) commented Feb 6, 2025

/cc

@mimowo (Contributor) commented Feb 6, 2025

/hold
For determining whether we want to support preemption.borrowWithinCohort along with FairSharing, see #3779 (comment), as this PR makes the preemption.borrowWithinCohort setting a no-op.

If we decide the combination is useful, we may need to rewrite the PR. If we decide the combination is not useful, I would like to follow up with validation that prevents combining the settings.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 6, 2025
@tenzen-y (Member) commented Feb 6, 2025

> /hold For determining whether we want to support preemption.borrowWithinCohort along with FairSharing […]

One more consideration: this PR changes the current FairSharing feature specification (before this PR, borrowWithinCohort could be involved in the FairSharing decision). Ideally, we want to look for another solution to resolve #3779. If we decide to go with this solution, we might want to prepare some mitigation path, since FairSharing has already been published as a Beta feature.

@tenzen-y (Member) left a comment

Changes look great to me. The next step is deciding whether we should take this approach.

@mimowo (Contributor) commented Feb 7, 2025

I gave it another round of thinking.

Actually, I think the root cause of the infinite loop is that we do belowThreshold || strategy, because then low-priority workloads can bypass the strategy.

So, I have another idea: turn this check into just strategy, but filter out workloads above the threshold in findCandidates when fairSharing is enabled. I believe this idea also has a relatively low implementation cost.
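
A minimal sketch of the difference, with illustrative names standing in for the actual preemption code:

```go
// Current check (sketch): a candidate below the borrowWithinCohort priority
// threshold is preemptable even when the fair-sharing strategy rejects it,
// which is how low-priority workloads bypass the strategy.
func canPreemptCurrent(belowThreshold, strategyAllows bool) bool {
	return belowThreshold || strategyAllows
}

// Proposed check (sketch): the strategy always decides; candidates above the
// threshold would instead be dropped earlier, in findCandidates.
func canPreemptProposed(strategyAllows bool) bool {
	return strategyAllows
}
```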

This will make sure no workloads bypass the strategy, yet we still require the target workloads to be below the priority threshold. I think it is likely a valid use case to safeguard high-priority workloads from fair sharing, and the field allows for that.

I think the idea of preemption.borrowWithinCohort.maxPriorityThreshold was to safeguard high-priority workloads from preemption.

EDIT: I would be happy to move forward with the approach in this comment (unless you see some counter-examples) and backport it, regardless of the long-term plans of combining the configurations.

@mimowo (Contributor) commented Feb 7, 2025

WDYT @gabesaba @tenzen-y ?

@tenzen-y (Member) commented Feb 7, 2025

> So, I have another idea: turn this check into just strategy […]

My concern is making fairSharing complicated, since the idea implies cooperation between borrowWithinCohort.maxPriorityThreshold and fairSharing.
As an alternative approach, we might be able to add a filtering mechanism similar to borrowWithinCohort.maxPriorityThreshold to fairSharing, and then make borrowWithinCohort.maxPriorityThreshold and fairSharing mutually exclusive.

But honestly, I'm not sure which solution would be more understandable for users. So it might be better to proceed with your (@mimowo) approach, and then consider the above approach (fairSharing with a filtering mechanism) based on user feedback.
That is the lower-risk approach, IMO.

@gabesaba (Contributor, Author) commented Feb 7, 2025

> I think it is likely a valid use case to safeguard high-priority workloads from fair sharing, and the field allows for that.

This protection is defined on the preempting CQ's side, which means that for these high-priority workloads to be safe, the setting needs to be applied to all CQs. I think a better interface would be preemption protection defined by the CQ that wants to protect its workloads, or for that CQ to simply not borrow in a flavor where it doesn't want to risk preemptions.

> So, I have another idea: turn this check into just strategy […]

This still leads to weird behavior when the workload wants to do reclaimWithinCohort. With this proposal, we would filter out valid reclaimWithinCohort targets, so the semantics no longer match non-fair-sharing reclamation. We could add a condition to do reclaim when not borrowing [1][2] - but I'm opposed to introducing more complexity into an already complex algorithm. I think we should leave these decisions to the result of the fair share value, to keep the complexity manageable.

notes:
[1] but this adds even more complexity, as the definition of borrowing is complex in the hierarchical case - #3948 (comment)
[2] plus the CQ could stop borrowing once it preempts some of its own workloads

@mimowo (Contributor) commented Feb 7, 2025

> As an alternative approach, we might be able to add a similar filtering mechanism […]

That was also my thought, but in a way FairSharing is already aware of other candidate-filtering configuration based on preemption.reclaimWithinCohort, so also respecting preemption.borrowWithinCohort.maxPriorityThreshold seems justified.

@mimowo (Contributor) commented Feb 7, 2025

> we would filter valid reclaimWithinCohort targets, so that the semantics no longer match non-fair-sharing reclamation

We already use preemption.reclaimWithinCohort to filter the candidates later considered by the FairSharing algorithm, based on the relative priority of the preemptor and the candidate. Considering preemption.borrowWithinCohort when fair sharing is enabled just adds another constraint on the set of candidates, based on absolute rather than relative priority.
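
A sketch of the relative vs. absolute distinction, with illustrative types and names rather than Kueue's actual implementation:

```go
type workload struct {
	priority int32
}

// isCandidate shows the two kinds of filtering side by side.
func isCandidate(preemptor, candidate workload, reclaimLowerPriorityOnly bool, maxPriorityThreshold *int32) bool {
	// preemption.reclaimWithinCohort: a relative constraint; with policy
	// LowerPriority, only candidates with lower priority than the preemptor
	// remain.
	if reclaimLowerPriorityOnly && candidate.priority >= preemptor.priority {
		return false
	}
	// preemption.borrowWithinCohort.maxPriorityThreshold: an absolute
	// constraint; the candidate's priority is compared to a fixed threshold,
	// independent of the preemptor.
	if maxPriorityThreshold != nil && candidate.priority > *maxPriorityThreshold {
		return false
	}
	return true
}
```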

> We could add a condition to do reclaim when not borrowing [1][2]

The notes are valid, and I agree this approach leads to some sub-optimal decisions given the dynamic nature of "borrowing" as we progress through candidates. However:

  • the algorithm as in master does not make the check dynamic [2] based on the current needs of preemption, so there would be no regression in this regard
  • it eliminates the infinite loop while allowing a cap to be put on the priority of the preempted workloads (if one has such a business use case)
  • it lowers the complexity of the algorithm anyway, as the filtering is moved to findCandidates, which is easier to understand

> but I'm opposed to introducing more complexity into an already complex algorithm

It depends how you look at it. IMO, the complexity would be lower anyway compared to the current "main" branch, because we would essentially commit your changes inside fairPreemptions but move the complexity to findCandidates. So the main part of the algorithm would get simplified anyway.

In the end, I'm fine with the approach proposed in this PR as long as we somehow confirm it does not break any business use cases, but I'm worried that confirming this might be difficult in practice given the OSS nature of the software. When we disable configuration, it is generally better to go through some deprecation (feature-gate) process, and in the short term have something that can be cherry-picked as a bugfix with marginal risk of breaking.

@gabesaba (Contributor, Author) commented Feb 7, 2025

Discussed and reached consensus with @mimowo offline

The infinite preemption issue (which this PR fixes), and whether we limit preemptions to a threshold priority (potentially how users expect this parameter to work with FairSharing), are separate issues. Since the latter didn't exist in FairSharing in 0.10.1 anyway, we will address it separately, ensuring that the new API makes sense. I opened #4173 with a proposal.

I disagree with the approach of using preemption.borrowWithinCohort with FairSharing, as it results in opposite semantics in Classical Preemption vs Fair Sharing. See argument in #4173

> As an alternative approach, we might be able to add a similar filtering mechanism […]

@tenzen-y Please take a look at #4173. I think we can add a threshold in reclaimWithinCohort to accomplish this

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 7, 2025
@mimowo (Contributor) commented Feb 10, 2025

Yes, I agree that supporting borrowWithinCohort with FairSharing would be confusing. For example, my proposal (of filtering the candidates based on the threshold regardless of whether the preemptor is borrowing) also leads to problematic behavior in some cases, when the preemptor no longer needs to borrow as the algorithm progresses.

So, I'm happy to just make it no-op.

However, I would like to update the documentation for the preemption.borrowWithinCohort field to say: "it is not supported when fair sharing is enabled", or "the behavior when fair sharing is enabled is unspecified; the current implementation ignores the field". We may also update some other relevant places in the documentation.

Ideally, we would also have some warnings in webhooks and on Kueue startup, but if this requires substantial ground work then I'm ok with just documentation update.

@tenzen-y (Member) commented:

> Ideally, we would also have some warnings in webhooks and on Kueue startup […]

I think that would be easy. I guess we can just pass the Kueue Configuration to the CQ webhook server and hold it in type ClusterQueueWebhook struct{}.

The validation could be performed when a CQ is created or updated.
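
A rough sketch of what that could look like; the configuration plumbing and the exact field checks are assumptions, not the code in this PR:

```go
package webhooks

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/util/validation/field"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"

	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

// ClusterQueueWebhook holds the relevant piece of the Kueue Configuration,
// injected when the webhook server is set up (hypothetical field).
type ClusterQueueWebhook struct {
	fairSharingEnabled bool
}

// ValidateCreate rejects a ClusterQueue that enables borrowWithinCohort while
// fair sharing is on; ValidateUpdate would run the same check.
func (w *ClusterQueueWebhook) ValidateCreate(ctx context.Context, obj runtime.Object) (admission.Warnings, error) {
	cq := obj.(*kueue.ClusterQueue)
	p := cq.Spec.Preemption
	if w.fairSharingEnabled && p != nil && p.BorrowWithinCohort != nil &&
		p.BorrowWithinCohort.Policy != kueue.BorrowWithinCohortPolicyNever {
		return nil, field.Forbidden(
			field.NewPath("spec", "preemption", "borrowWithinCohort"),
			"borrowWithinCohort is not supported when fair sharing is enabled")
	}
	return nil, nil
}
```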

@gabesaba gabesaba force-pushed the fair_sharing_preemption_loop branch from a2a965f to 1d96117 Compare February 17, 2025 11:15
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2025
@gabesaba gabesaba force-pushed the fair_sharing_preemption_loop branch from 1d96117 to 28653b2 Compare February 17, 2025 11:18
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 17, 2025
@gabesaba (Contributor, Author) commented:

a2a965f to 1d96117 - rebase
1d96117 to 28653b2 - documentation + webhook validation

@mimowo @tenzen-y

@gabesaba gabesaba force-pushed the fair_sharing_preemption_loop branch 2 times, most recently from 40a8211 to 88c83fb Compare February 17, 2025 11:43
@gabesaba (Contributor, Author) commented:

> Ideally, we would also have some warnings in webhooks and on Kueue startup […]
>
> I think that would be easy. I guess we can just pass the Kueue Configuration to the CQ webhook server […]

Added this validation. I only tested it in unit tests, not in integration tests, as that would have required some refactoring of the webhook integration tests to pass this new configuration. I can update these tests if you think it is worth the effort.

@gabesaba gabesaba force-pushed the fair_sharing_preemption_loop branch from 88c83fb to 9c2c491 Compare February 17, 2025 11:58
@k8s-ci-robot (Contributor) commented Feb 17, 2025

@gabesaba: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kueue-test-customconfigs-e2e-main
Commit: a2a965f (link)
Required: true
Rerun command: /test pull-kueue-test-customconfigs-e2e-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@gabesaba gabesaba force-pushed the fair_sharing_preemption_loop branch from 9c2c491 to 9e733cf Compare February 17, 2025 12:46
@mimowo (Contributor) commented Feb 17, 2025

The strong validation makes me feel uncomfortable - I'm thinking about the migration from classical preemption to fair sharing.

Consider the two migration options for users who use borrowWithinCohort:

  1. enable fair sharing then adjust CQs; the requests to update CQ will continuously fail until the CQs are updated
  2. adjust CQs then enable fair sharing; there is no preemption while borrowing until fair sharing is enabled

Also, there is no easy way to experiment (enable / disable fair sharing to compare the impact). It will require scripting the enablement and disablement procedures to make it quick. I think ideally the validation would be "soft" warnings rather than "hard".

I would suggest splitting the PR into two:

  1. logic & docs changes
  2. validation

So that we can make the decision independently. I'm convinced we want to cherry-pick (1.), but (2.) makes me feel less comfortable. I think ideally this is not strong validation, but soft. We have two types of validation in the core k8s, see:

If you @tenzen-y and @gabesaba think strong validation is ok, I'm good to go, but even then I'd prefer two PRs to be able to easily bail out of (revert) just the validation.
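
For reference, a minimal sketch of the "soft" flavor mentioned above: the webhook admits the object but attaches a warning, whereas the hard flavor would return an error (names are illustrative):

```go
package webhooks

import "sigs.k8s.io/controller-runtime/pkg/webhook/admission"

// validateCombination warns (soft) instead of rejecting (hard) when the no-op
// combination is detected, so existing CQs keep working during a migration or
// an enable/disable experiment with fair sharing.
func validateCombination(fairSharingEnabled, borrowWithinCohortSet bool) (admission.Warnings, error) {
	if fairSharingEnabled && borrowWithinCohortSet {
		return admission.Warnings{
			"spec.preemption.borrowWithinCohort is a no-op when fair sharing is enabled",
		}, nil
	}
	return nil, nil
}
```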

@gabesaba (Contributor, Author) commented:

> The strong validation makes me feel uncomfortable […]

Split out to #4298

Let's discuss there whether we add soft or strong validation, and whether or not to cherry-pick that change

@tenzen-y (Member) commented:

> The strong validation makes me feel uncomfortable […]

I fully agree with you.

@gabesaba (Contributor, Author) commented:

> The strong validation makes me feel uncomfortable […]
>
> I fully agree with you.

Removed the validations from this PR, so it is only fixing the bug and updating documentation. I also updated the release note.

@mimowo (Contributor) commented Feb 18, 2025

/lgtm
/approve
/cherry-pick release-0.10 release-0.9

@k8s-infra-cherrypick-robot (Contributor) commented:

@mimowo: once the present PR merges, I will cherry-pick it on top of release-0.10 in a new PR and assign it to you.

In response to this:

/lgtm
/approve
/cherry-pick release-0.10 release-0.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2025
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gabesaba, mimowo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot (Contributor) commented:

LGTM label has been added.

Git tree hash: f7ddced2404225cdff9d823bdb5ca949983d2ded

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 18, 2025
@mimowo (Contributor) commented Feb 18, 2025

Leaving the hold label for @gabesaba to release.

@tenzen-y (Member) left a comment

/hold
for API comment appearance.

@tenzen-y (Member) left a comment

@gabesaba Could you investigate the reason why the new API comments are not recorded in the CRD?

Successfully merging this pull request may close: Infinite preemption loop in fair sharing (#3779)