
OCPBUGS-28647: tuned: operand: add support for deferred updates #1019

Open

ffromani wants to merge 16 commits into master from tuned-deferred-updates

Conversation

@ffromani ffromani commented Apr 4, 2024

Users can make tuning changes through NTO/TuneD by updating the CRs on a running system.
When a tuning change is made, the NTO sends a SIGHUP to the tuned process, which results in the tuned process rolling back ALL the tuning and then reapplying it.
In some key use cases, this disrupts the workload behavior.

We add an annotation that causes the update not to be applied immediately, but to be deferred until the next tuned restart. The NTO operator will continue merging the tuned objects into the special rendered object as it does today.
The deferred annotation will be "sticky": if even just one of the current tuned objects (e.g. just the PerformanceProfile one) has the annotation, the rendered tuned will have it too.

However, if a deferred update is pending but the tuned objects are edited in such a way that the final rendered update is immediate (= no deferred annotation), then the immediate update will be applied, overwriting the pending deferred update, which is then lost.

Deferred updates can stack, meaning a deferred update can be applied on top of a previous deferred update; the latter simply overwrites the former, as if the former had never been received. An immediate update can overwrite a deferred update, and clear it.
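
For illustration, here is a minimal Go sketch of the sticky-merge rule. The annotation key and helper names are hypothetical (the PR's real helper, util.HasDeferredUpdateAnnotation, appears in the review below):

// deferredAnnotation is a hypothetical key; the actual key used by the PR may differ.
package main

import "fmt"

const deferredAnnotation = "tuned.openshift.io/deferred"

// hasDeferredAnnotation reports whether a Tuned object's annotations carry the deferred key.
func hasDeferredAnnotation(annotations map[string]string) bool {
	_, ok := annotations[deferredAnnotation]
	return ok
}

// renderedIsDeferred applies the sticky rule: the rendered object is deferred
// if at least one of the source Tuned objects carries the annotation.
func renderedIsDeferred(sources []map[string]string) bool {
	for _, anns := range sources {
		if hasDeferredAnnotation(anns) {
			return true
		}
	}
	return false
}

func main() {
	sources := []map[string]string{
		{},                       // default Tuned object: no annotation
		{deferredAnnotation: ""}, // e.g. the PerformanceProfile-derived object
	}
	fmt.Println(renderedIsDeferred(sources)) // true: one annotated source is enough
}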

Like immediate updates, the NTO operand will take care of applying the deferred updates, as follows:

The NTO operand will gain another chance to process tuned objects at startup, right before the main loops start.

If a tuned object is detected and it does NOT have the deferred annotation:

  • the regular reconcile code will work as today
  • the startup reconcile code will not engage

If a tuned object is detected and it DOES HAVE the deferred annotation:

  • the regular reconcile code will set a filesystem flag, move the status to DEGRADED, and ignore the object.
  • at the first subsequent operand restart, the startup reconcile code will notice the filesystem flag, apply the rendered tuned object (granted it DOES have the deferred flag, otherwise it will log and skip), and clear the filesystem flag. The DEGRADED state will be cleared.

In other words, the startup reconcile code will engage IFF the rendered tuned has the deferred annotation and the filesystem flag exists. Likewise, the regular reconcile loop will only set the filesystem flag if it detects a rendered tuned with the deferred annotation.

The default remains immediate updates, for backward compatibility.
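
A sketch of the startup/regular reconcile split described above; the flag file path and function names are made up for illustration, not taken from the PR:

package main

import (
	"fmt"
	"os"
)

// deferredFlagPath is a hypothetical location for the filesystem flag.
const deferredFlagPath = "/var/lib/ocp-tuned/deferred-update-pending"

func deferredFlagExists() bool {
	_, err := os.Stat(deferredFlagPath)
	return err == nil
}

// regularReconcile only sets the flag (and moves the Profile status to
// DEGRADED) when the rendered tuned carries the deferred annotation;
// otherwise it applies the update immediately, as today.
func regularReconcile(renderedIsDeferred bool) {
	if renderedIsDeferred {
		_ = os.WriteFile(deferredFlagPath, nil, 0o644)
		fmt.Println("update deferred; status moved to DEGRADED")
		return
	}
	fmt.Println("update applied immediately")
}

// startupReconcile engages IFF the rendered tuned is deferred AND the
// filesystem flag exists: it applies the pending update and clears the flag,
// which in turn allows the DEGRADED condition to be cleared.
func startupReconcile(renderedIsDeferred bool) {
	if !renderedIsDeferred {
		if deferredFlagExists() {
			fmt.Println("flag set but no deferred annotation: log and skip")
		}
		return
	}
	if !deferredFlagExists() {
		return // nothing pending
	}
	fmt.Println("applying pending deferred update")
	_ = os.Remove(deferredFlagPath)
}

func main() {
	regularReconcile(true) // defers the update, sets the flag
	startupReconcile(true) // on the next operand restart: applies and clears the flag
}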

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 4, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 4, 2024
@openshift-ci-robot

@ffromani: This pull request references Jira Issue OCPBUGS-28647, which is invalid:

  • expected the bug to target the "4.16.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this: (the PR description, quoted above)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

ffromani commented Apr 4, 2024

/jira refresh

@openshift-ci openshift-ci bot requested review from MarSik and yanirq April 4, 2024 10:15
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 4, 2024
@openshift-ci-robot

@ffromani: This pull request references Jira Issue OCPBUGS-28647, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @gsr-shanks

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from gsr-shanks April 4, 2024 10:15

openshift-ci bot commented Apr 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ffromani ffromani changed the title WIP: OCPBUGS-28647: tuned: operand: add support for deferred updates WIP: POC: OCPBUGS-28647: tuned: operand: add support for deferred updates Apr 4, 2024
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 4, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 20, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2024
@ffromani ffromani force-pushed the tuned-deferred-updates branch 4 times, most recently from f0b7ece to 2314db4 Compare May 8, 2024 17:19
ffromani commented May 9, 2024

/retest

ffromani commented May 9, 2024

 ERRO[2024-05-09T07:59:44Z] 
  * could not run steps: step e2e-gcp-pao failed: "e2e-gcp-pao" pre steps failed: "e2e-gcp-pao" pod "e2e-gcp-pao-ipi-install-install" failed: could not watch pod: the pod ci-op-t8wxg3mq/e2e-gcp-pao-ipi-install-install failed after 1h0m10s (failed containers: test): ContainerFailed one or more containers exited

still looks like an infra issue to me.

@ffromani ffromani force-pushed the tuned-deferred-updates branch 2 times, most recently from 04635db to f0873a9 Compare May 10, 2024 09:29
@ffromani ffromani changed the title WIP: POC: OCPBUGS-28647: tuned: operand: add support for deferred updates OCPBUGS-28647: tuned: operand: add support for deferred updates May 10, 2024
@ffromani

The PR is now reviewable.

add support to report the deferred status in
conditions.

Signed-off-by: Francesco Romani <[email protected]>
TunedProfiles -> tunedProfiles
trivial rename. The function was not used elsewhere anyway.

Signed-off-by: Francesco Romani <[email protected]>
rework profilesExtract and helper functions
to make them more testable, enable round-tripping,
and add helpers which will be used in the deferred
update functionality.

Signed-off-by: Francesco Romani <[email protected]>
Simplify the interface by packing the many return values into
a single struct, to enable further extensibility.
No intended changes in behavior.

Signed-off-by: Francesco Romani <[email protected]>
Gather the node name once and use it in the lifecycle of
the controller - it is not supposed to change anyway.
Bubble up the client creation, so we don't need to
keep around the kubeconfig.

Signed-off-by: Francesco Romani <[email protected]>
Align to left for readability with no intended changes in behavior.

Signed-off-by: Francesco Romani <[email protected]>
Read extracted profiles on startup and reconstruct
the node state, to be used to implement the
deferred updates feature.
Please note all the operations are non-destructive
and read-only, so they are quite safe.

Signed-off-by: Francesco Romani <[email protected]>
The deferred updates feature needs to perform specific
actions when the tuned daemon is reloaded (and only then).
Add a hook point with specific code for that.

Signed-off-by: Francesco Romani <[email protected]>
Rework how the tuned controller updates its status to make
room for the deferred update feature.

Signed-off-by: Francesco Romani <[email protected]>
Now that all the pieces are in place, we can set
and use the deferred update flag to implement
the desired behavior.

Signed-off-by: Francesco Romani <[email protected]>
clarify the logs when the tuned controller reacts to a write event.

Signed-off-by: Francesco Romani <[email protected]>
Trigger an event when the recommend file is written,
and always do that even if it does not change.
This is needed to react properly when a deferred
update is un-deferred by editing the tuned object.

Without a trigger in this state the k8s object
status wouldn't be correctly updated.

Signed-off-by: Francesco Romani <[email protected]>
Signed-off-by: Francesco Romani <[email protected]>
MarSik commented May 22, 2024

@ffromani Will this apply the profile even when you just kill the tuned pod on a node? Or is there logic to wait for clean reboot (say using a /var/run/ file or its absence)?

@ffromani

@ffromani Will this apply the profile even when you just kill the tuned pod on a node? Or is there logic to wait for clean reboot (say using a /var/run/ file or its absence)?

we will apply the profile on each tuned restart, not on node reboot

jmencak commented May 22, 2024

/retest

1 similar comment
jmencak commented May 23, 2024

/retest

@ffromani

level=error msg=These cluster operators were not stable: [node-tuning]

I wonder if my changes are causing this CI failure. Testing on my cluster revealed no issue whatsoever, though.
I'll run a bisect with a test PR.

jmencak commented May 23, 2024

level=error msg=These cluster operators were not stable: [node-tuning]

I wonder if my changes are causing this CI failure. Testing on my cluster revealed no issue whatsoever, though. I'll run a bisect with a test PR.

Good find. Yeah, I suspect updates or flips with ClusterOperator/node-tuning. I've noticed in the past that simple PRs to this repo passed, whereas this bigger change was failing multiple tests.

@ffromani

level=error msg=These cluster operators were not stable: [node-tuning]

I wonder if my changes are causing this CI failure. Testing on my cluster revealed no issue whatsoever, though. I'll run a bisect with a test PR.

#1065


openshift-ci bot commented May 23, 2024

@ffromani: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                             Commit   Details  Required  Rerun command
ci/prow/e2e-aws-operator              5672523  link     true      /test e2e-aws-operator
ci/prow/e2e-aws-ovn                   5672523  link     true      /test e2e-aws-ovn
ci/prow/e2e-gcp-pao                   5672523  link     true      /test e2e-gcp-pao
ci/prow/e2e-gcp-pao-workloadhints     5672523  link     true      /test e2e-gcp-pao-workloadhints
ci/prow/e2e-gcp-pao-updating-profile  5672523  link     true      /test e2e-gcp-pao-updating-profile
ci/prow/e2e-hypershift                5672523  link     true      /test e2e-hypershift
ci/prow/e2e-aws-ovn-techpreview       5672523  link     true      /test e2e-aws-ovn-techpreview

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ffromani

/hold

let's stop retesting until #1065 sheds some light here

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 23, 2024
jmencak commented May 23, 2024

@ffromani Will this apply the profile even when you just kill the tuned pod on a node? Or is there logic to wait for clean reboot (say using a /var/run/ file or its absence)?

we will apply the profile on each tuned restart, not on node reboot

From my testing, yes, if you delete the TuneD pod, the current code as is will apply the new profile even though it has the deferred annotation set. It will, however, refuse to update the current profile when you create a new Tuned CR with the deferred annotation set.

So far, I've noticed one issue. If I reboot the node and the system has a profile with deferred annotation set, I still see:

$ oc get co/node-tuning
NAME          VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
node-tuning   4.16.0-0.nightly-2024-05-21-221942   True        True          False      16m     Waiting for 1/1 Profiles to be applied

even though the profile has been successfully applied.

@ffromani

So far, I've noticed one issue. If I reboot the node and the system has a profile with deferred annotation set, I still see:

$ oc get co/node-tuning
NAME          VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
node-tuning   4.16.0-0.nightly-2024-05-21-221942   True        True          False      16m     Waiting for 1/1 Profiles to be applied

even though the profile has been successfully applied.

thanks, this is new. I'll check once I get rid of the CI woes

jmencak commented May 23, 2024

thanks, this is new. I'll check once I get rid of the CI woes

I wonder if that could be related. CI doesn't like things like CO objects progressing, etc. Another thing is whether we want to block the update until the next reboot, i.e. what @MarSik hinted at. Should the pod deletion apply the deferred update?

@ffromani

thanks, this is new. I'll check once I get rid of the CI woes

I wonder if that could be related. CI doesn't like things like CO objects progressing, etc. Another thing is whether we want to block the update until the next reboot, i.e. what @MarSik hinted at. Should the pod deletion apply the deferred update?

This needs to be verified (by yours truly), but the failure seems to be in the early stage of CI setup, well before the e2e tests run. And we create profiles with the deferred annotation only in the newly added e2e tests.


@jmencak jmencak left a comment


This is great work. I like the changes. I found only a few nits so far, but I haven't finished the full review; it is a lot of code. Sharing a part of it.

return profileData == string(content)
}

// ProfilesExtract extracts TuneD daemon profiles to tunedProfilesDirCustom directory.
// Returns:
// - True if the data in the to-be-extracted recommended profile or the profiles being
// included from the current recommended profile have changed.
// - If the data changed, the fingerprint of the new profile, or "" otherwise

SuperNit: Let's be consistent.

s/or "" otherwise/or "" otherwise./

Similarly elsewhere, for example in the Daemon/Change struct. I try to add punctuation when a line starts a comment and feels like a sentence/paragraph. I typically don't add it for very short comments that don't feel like a sentence. These are typically on the same line as the code.

recommendedProfileDeps[recommendedProfile] = true
}
extracted := map[string]bool{} // TuneD profile names present in TuneD CR and successfully extracted to tunedProfilesDirCustom
if len(recommendedProfile) == 0 {

This is a "heritage" from my previous code, but I believe that this check should be moved to tuned_parser.go . Then, we can probably remove this function completely and just move the assignment

        // Add the recommended profile itself.
        recommendedProfileDeps[recommendedProfile] = true

If you agree, then there'll be little point in having profilesExtractPathRecommends() either, and we can have the whole code in ProfilesExtract() as it was before, with no need to document/comment/explain the new functions.

return change, profilesFP, nil
}

func profilesFingerprint(profiles []tunedv1.TunedProfile, recommendedProfile string) string {

I believe this function deserves a comment. Something along the lines of:

// profilesFingerprint returns a hash of `recommendedProfile` name joined with the data sections of all TuneD profiles in the `profiles` slice.

func profilesFingerprint(profiles []tunedv1.TunedProfile, recommendedProfile string) string {
h := sha256.New()
h.Write([]byte(recommendedProfile))
for _, prof := range profiles {

Do we expect any issues with profile ordering? Will they always come ordered the same way, e.g. profile a prior to profile b and not vice versa the next time?
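
For illustration, a sketch of an order-independent variant that sorts profiles by name before hashing; this is not the PR's code, and the TunedProfile type is stubbed here (the real one lives in the tuned v1 API package):

package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// TunedProfile stubs the relevant fields of the real API type.
type TunedProfile struct {
	Name string
	Data string
}

// profilesFingerprintSorted hashes the recommended profile name plus the
// data sections of all profiles, sorted by name so the result does not
// depend on the order in which the profiles arrive.
func profilesFingerprintSorted(profiles []TunedProfile, recommendedProfile string) string {
	sorted := make([]TunedProfile, len(profiles))
	copy(sorted, profiles)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	h := sha256.New()
	h.Write([]byte(recommendedProfile))
	for _, prof := range sorted {
		h.Write([]byte(prof.Data))
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	a := TunedProfile{Name: "a", Data: "[main]\n"}
	b := TunedProfile{Name: "b", Data: "[sysctl]\n"}
	// Same fingerprint regardless of input order.
	fmt.Println(profilesFingerprintSorted([]TunedProfile{a, b}, "openshift-node") ==
		profilesFingerprintSorted([]TunedProfile{b, a}, "openshift-node"))
}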

return change, profilesFP, extracted, recommendedProfileDeps, nil
}

func profilesRepackPath(rfPath, profilesRootDir string) ([]tunedv1.TunedProfile, string, error) {

Why do we need rfPath? Can't we just use tunedRecommendFile? Similarly for profilesRootDir.

Is the name Perl-inspired, from the pack/unpack functions? I believe this function deserves a comment and possibly even a rename. Something along the lines of:

profiles(Re)[Pp]ackFromPath reconstructs the slice of TunedProfile objects out of the TuneD configuration files from `tunedProfilesDirCustom`.
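
To make the suggested comment concrete, a hedged sketch of what such a "repack from path" could look like: walking per-profile directories under a root dir and rebuilding the profile slice. The directory layout, file names, and the TunedProfile stub are assumptions, not the PR's actual code:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// TunedProfile stubs the relevant fields of the real API type.
type TunedProfile struct {
	Name string
	Data string
}

// profilesRepackFromPath rebuilds TunedProfile objects from extracted
// per-profile directories, assuming a layout of <root>/<profile>/tuned.conf.
func profilesRepackFromPath(profilesRootDir string) ([]TunedProfile, error) {
	entries, err := os.ReadDir(profilesRootDir)
	if err != nil {
		return nil, err
	}
	var profiles []TunedProfile
	for _, entry := range entries {
		if !entry.IsDir() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(profilesRootDir, entry.Name(), "tuned.conf"))
		if err != nil {
			return nil, err
		}
		profiles = append(profiles, TunedProfile{Name: entry.Name(), Data: string(data)})
	}
	return profiles, nil
}

func main() {
	profiles, err := profilesRepackFromPath("/etc/tuned") // illustrative root
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, p := range profiles {
		fmt.Printf("repacked profile %q (%d bytes)\n", p.Name, len(p.Data))
	}
}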

return profiles, recommendedProfile, nil
}

func ProfilesRepack() ([]tunedv1.TunedProfile, string, error) {

Do we need to export this? Also, do we need such a "shallow" function?

klog.Error(err.Error())
return false // retry later
}
func (c *Controller) changeSyncerPostReload(change Change) bool {

I believe we need some comments to describe the general purpose of this function. Possibly even adjust its name to changeSyncerPostDaemonReload?

}

klog.Infof("changeSyncerPostReload(): current effective profile fingerprint %q -> %q", c.daemon.profileFingerprintEffective, profileFP)

Should we log this every time even though the fingerprint hasn't changed? I.e., does it bring any value or are we just polluting logs?

klog.Infof("changeSyncer(%s)", change.String()) // TODO v=2
defer klog.Infof("changeSyncer(%s) done", change.String()) // TODO v=2
// Sync internal status after a TuneD reload
_ = c.changeSyncerPostReload(change)

Do we need the return value for anything?


@jmencak jmencak left a comment


Adding a few more comments. In my view, the main things that need to be resolved:

  • CI issues
  • As agreed over Slack, the pod restart should not apply a new profile when deferred annotation is set.
  • co/node-tuning status

After that, I'll do another round of review and testing.

@@ -88,7 +88,7 @@ func InitializeStatusConditions() []tunedv1.ProfileStatusCondition {
// 'status' contains all the information necessary for creating a new slice of
// conditions apart from LastTransitionTime, which is set based on checking the
// old conditions.
func computeStatusConditions(status Bits, stderr string, conditions []tunedv1.ProfileStatusCondition) []tunedv1.ProfileStatusCondition {
func computeStatusConditions(status Bits, message string, conditions []tunedv1.ProfileStatusCondition) []tunedv1.ProfileStatusCondition {

An additional parameter was added to the function. Let's explain what it does by updating the comments above.

}

var message string
deferred := (change.deferred || util.HasDeferredUpdateAnnotation(profile.Annotations))

Do we need both or is util.HasDeferredUpdateAnnotation(profile.Annotations) enough?

klog.V(1).Infof("write event on: %s", fsEvent.Name)
// Notify the event processor that the TuneD daemon calculated new kernel command-line parameters.
klog.Infof("write event on: %s", fsEvent.Name)
// Notify the event processor that profiles where unpacked on the node or

This comment is slightly misleading. We're watching writes to the bootcmdline and recommend files, not unpacking profiles on the node. They are unpacked, but that's not why the write event was triggered.
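
For context, a minimal fsnotify sketch of this kind of write-event watching; the watched paths are illustrative, not the operand's actual configuration:

package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

func main() {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer watcher.Close()

	// Watch the files whose writes drive the event processor.
	for _, path := range []string{
		"/etc/tuned/recommend.d/50-openshift.conf", // recommend file (illustrative path)
		"/run/tuned/bootcmdline",                   // bootcmdline (illustrative path)
	} {
		if err := watcher.Add(path); err != nil {
			log.Fatal(err)
		}
	}

	for event := range watcher.Events {
		if event.Op&fsnotify.Write != 0 {
			log.Printf("write event on: %s", event.Name)
			// Notify the event processor here.
		}
	}
}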

}

klog.Infof("changeSyncerPostReload(): current effective profile fingerprint %q -> %q", c.daemon.profileFingerprintEffective, profileFP)
c.daemon.profileFingerprintEffective = profileFP
c.daemon.status &= ^scDeferred // force clear even if it was never set. No biggie.

Can we drop the familiar "No biggie."?

} else if (status & scDeferred) != 0 {
tunedDegradedCondition.Status = corev1.ConditionTrue
tunedDegradedCondition.Reason = "TunedDeferredUpdate"
tunedDegradedCondition.Message = "Profile will be applied at the next daemon restart" + deferredMessage

We should probably be consistent here

s/next daemon restart/next daemon restart: /

@@ -0,0 +1,407 @@
package e2e

Do the tests in e2e/deferred/ actually run? I don't see the Makefile adjusted so that they run automatically.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.