-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-28647: tuned: operand: add support for deferred updates #1019
base: master
Are you sure you want to change the base?
Conversation
@ffromani: This pull request references Jira Issue OCPBUGS-28647, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@ffromani: This pull request references Jira Issue OCPBUGS-28647, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ffromani The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ab08d68
to
7a63c46
Compare
7a63c46
to
30cdab0
Compare
30cdab0
to
5376eac
Compare
f0b7ece
to
2314db4
Compare
/retest |
stil looks as infra issue to me. |
04635db
to
f0873a9
Compare
the PR is now reviewable |
add support to report the deferred status in conditions. Signed-off-by: Francesco Romani <[email protected]>
TunedProfiles -> tunedProfiles trivial rename. The function was not used elsewhere anyway. Signed-off-by: Francesco Romani <[email protected]>
rework profilesExtract and helper functions to make more testable, enable round-tripping, and add helpers which will be used in the deferred update functionality. Signed-off-by: Francesco Romani <[email protected]>
Simplify the interface packing the many return values in a single struct, to enable further extensibility. No intended changes in behavior. Signed-off-by: Francesco Romani <[email protected]>
Gather the node name once and use it in the lifecycle of the controller - is not supposed to change anyway. Bubble up the client creation, so we don't need to keep around the kubeconfig. Signed-off-by: Francesco Romani <[email protected]>
Align to left for readability with no intended changes in behavior. Signed-off-by: Francesco Romani <[email protected]>
Read extracted profiles on startup and reconstruct the node state, to be used ro implement the deferred updates feature. Please note all the operations are non-destructive read-only so pretty safe. Signed-off-by: Francesco Romani <[email protected]>
The deferred updates feature need to do specific actions when the tuned daemon is reloaded (and only then). Add hook point with specific code for that. Signed-off-by: Francesco Romani <[email protected]>
Rework how tuned controller updates status to make room for the deferred update feature. Signed-off-by: Francesco Romani <[email protected]>
Now that all the pieces are in place, we can set and use the deferred update flag to implement the desired behavior. Signed-off-by: Francesco Romani <[email protected]>
clarify the logs when the tuned controller reacts to a write event. Signed-off-by: Francesco Romani <[email protected]>
Trigger an event when the recommend file is written, and always do that even if does not change. This is needed to react properly when a deferred update is un-deferred by editing the tuned object. Without a trigger in this state the k8s object status wouldn't be correctly updated. Signed-off-by: Francesco Romani <[email protected]>
Signed-off-by: Francesco Romani <[email protected]>
747e901
to
5672523
Compare
@ffromani Will this apply the profile even when you just kill the tuned pod on a node? Or is there logic to wait for clean reboot (say using a /var/run/ file or its absence)? |
we will apply the profile on each tuned restart, not on node reboot |
/retest |
1 similar comment
/retest |
I wonder if my changes are causing this CI failure. Testing on my cluster revelead no issue whatsoever though. |
Good find. Yeah, I suspect updates or flips with |
|
@ffromani: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
/hold let's stop retesting until #1065 sheds some light here |
From my testing, yes, if you delete the So far, I've noticed one issue. If I reboot the node and the system has a profile with deferred annotation set, I still see:
even though the profile has been successfully applied. |
thanks, this is new. I'll check once I get rid of the CI woes |
I wonder if that could be related. CI doesn't like things like CO objects progressing, etc. Another thing is whether we want to block the update until the next reboot, i.e. what @MarSik hinted at. Should the pod deletion apply the deferred update? |
This needs to be verified (by yours truly) but the failure seems into the early stage of CI setup, well before the e2e tests run. And we create profiles with the deferred annotation only in the newly added e2e tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great work. I like the changes. I found only a few nits so far, but I haven't finished the full review, it is a lot of code. Sharing a part of it.
return profileData == string(content) | ||
} | ||
|
||
// ProfilesExtract extracts TuneD daemon profiles to tunedProfilesDirCustom directory. | ||
// Returns: | ||
// - True if the data in the to-be-extracted recommended profile or the profiles being | ||
// included from the current recommended profile have changed. | ||
// - If the data changed, the fingerprint of the new profile, or "" otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SuperNit: Let's be consistent.
s/or "" otherwise/or "" otherwise./
Similarly elsewhere, for example in the Daemon/Change struct
. I try to add punctuation when a line starts a comment and feels like a sentence/paragraph. I typically don't add it for very short comments that don't feel like a sentence. These are typically on the same line with a code.
recommendedProfileDeps[recommendedProfile] = true | ||
} | ||
extracted := map[string]bool{} // TuneD profile names present in TuneD CR and successfully extracted to tunedProfilesDirCustom | ||
if len(recommendedProfile) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a "heritage" from my previous code, but I believe that this check should be moved to tuned_parser.go
. Then, we can probably remove this function completely and just move the assignment
// Add the recommended profile itself.
recommendedProfileDeps[recommendedProfile] = true
If you agree, then, there'll be little point to have profilesExtractPathRecommends()
too and we can have the whole code in ProfilesExtract()
as it was before and there will be no need to document/comment/explain the new functions.
return change, profilesFP, nil | ||
} | ||
|
||
func profilesFingerprint(profiles []tunedv1.TunedProfile, recommendedProfile string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this function deserves a comment. Something along the lines of:
// profilesFingerprint returns a hash of `recommendedProfile` name joined with the data sections of all TuneD profiles in the `profiles` slice.
func profilesFingerprint(profiles []tunedv1.TunedProfile, recommendedProfile string) string { | ||
h := sha256.New() | ||
h.Write([]byte(recommendedProfile)) | ||
for _, prof := range profiles { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect any issues with profile ordering? Will they always come ordered the same way, e.g. profile a
prior to profile b
and not vice versa the next time?
return change, profilesFP, extracted, recommendedProfileDeps, nil | ||
} | ||
|
||
func profilesRepackPath(rfPath, profilesRootDir string) ([]tunedv1.TunedProfile, string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need rfPath
, cannot we just use tunedRecommendFile
? Similarly profilesRootDir
.
Is the name Perl inspired by pack/unpack functions? I believe this function deserves a comment and possibly even function renaming. Comment something along the lines of:
profiles(Re)[Pp]ackFromPath reconstructs the slice of TunedProfile objects out of the TuneD configuration files from `tunedProfilesDirCustom`.
return profiles, recommendedProfile, nil | ||
} | ||
|
||
func ProfilesRepack() ([]tunedv1.TunedProfile, string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to export this? Also, do we need such a "shallow" function?
klog.Error(err.Error()) | ||
return false // retry later | ||
} | ||
func (c *Controller) changeSyncerPostReload(change Change) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we need some comments to describe the general purpose of this function. Possibly even adjust its name to changeSyncerPostDaemonReload
?
} | ||
|
||
klog.Infof("changeSyncerPostReload(): current effective profile fingerprint %q -> %q", c.daemon.profileFingerprintEffective, profileFP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we log this every time even though the fingerprint hasn't changed? I.e. does it bring any value or are we just poluting logs?
klog.Infof("changeSyncer(%s)", change.String()) // TODO v=2 | ||
defer klog.Infof("changeSyncer(%s) done", change.String()) // TODO v=2 | ||
// Sync internal status after a TuneD reload | ||
_ = c.changeSyncerPostReload(change) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the return value for anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a few more comments. In my view, the main things that need to be resolved:
- CI issues
- As agreed over Slack, the pod restart should not apply a new profile when deferred annotation is set.
- co/node-tuning status
After that, I'll do another round of review and testing.
@@ -88,7 +88,7 @@ func InitializeStatusConditions() []tunedv1.ProfileStatusCondition { | |||
// 'status' contains all the information necessary for creating a new slice of | |||
// conditions apart from LastTransitionTime, which is set based on checking the | |||
// old conditions. | |||
func computeStatusConditions(status Bits, stderr string, conditions []tunedv1.ProfileStatusCondition) []tunedv1.ProfileStatusCondition { | |||
func computeStatusConditions(status Bits, message string, conditions []tunedv1.ProfileStatusCondition) []tunedv1.ProfileStatusCondition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional parameter was added to the function. Let's explain what it does by updating the comments above.
} | ||
|
||
var message string | ||
deferred := (change.deferred || util.HasDeferredUpdateAnnotation(profile.Annotations)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need both or is util.HasDeferredUpdateAnnotation(profile.Annotations)
enough?
klog.V(1).Infof("write event on: %s", fsEvent.Name) | ||
// Notify the event processor that the TuneD daemon calculated new kernel command-line parameters. | ||
klog.Infof("write event on: %s", fsEvent.Name) | ||
// Notify the event processor that profiles where unpacked on the node or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a comment which is slightly misleading. We're watching writes to bootcmdline and recommend file, not unpacking profiles on the node. They are unpacked, but that's not why the write event was triggered.
} | ||
|
||
klog.Infof("changeSyncerPostReload(): current effective profile fingerprint %q -> %q", c.daemon.profileFingerprintEffective, profileFP) | ||
c.daemon.profileFingerprintEffective = profileFP | ||
c.daemon.status &= ^scDeferred // force clear even if it was never set. No biggie. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop the drop the familiar "No biggie."?
} else if (status & scDeferred) != 0 { | ||
tunedDegradedCondition.Status = corev1.ConditionTrue | ||
tunedDegradedCondition.Reason = "TunedDeferredUpdate" | ||
tunedDegradedCondition.Message = "Profile will be applied at the next daemon restart" + deferredMessage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably be consistent here
s/next daemon restart/next daemon restart: /
@@ -0,0 +1,407 @@ | |||
package e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the test in e2e/deferred/
actually run? I don't see the Makefile
adjusted so that they run automatically.
Users are able to make tuning changes through NTO/TuneD by updating the CRs on a running system.
When the tuning change is done, the NTO sends a SIGHUP to the tuned process, which results in the tuned process rolling back ALL the tuning and then reapplying it.
In some key use cases, this disrupts the workload behavior.
We add an annotation that cause the update not to be applied immediately, but to be deferred till the next tuned restart. The NTO operator will continue merging the tuned objects into the special
rendered
object as it does today.The deferred annotation will be "sticky". If even just one of the current tuned object (e.g. just the perfprofile one) have the annotation, the rendered tuned will.
However, if there is a deferred updated pending but the tuned objects are edited in such a way the final
rendered
update will be immediate (= no deferred annotation), then the immediate update will be applied overwriting the pending deferred update, which will then be lost.The deferred updates can stack, meaning a deferred update can be applied on top of a previous deferred update; the latter will just overwrite the former as it never was received. An immediate update can overwrite a deferred update, and clear it.
Like the immediate update, the NTO operand will take care of applying the deferred updates, as such:
The NTO operand will gain another chance to process tuned objects at startup, right before the main loops start.
If a tuned object is detected and it has NOT the deferred annotation:
If a tuned object is detected and it DOES HAVE the deferred annotation:
IOW, the startup reconcile code will engage IFF the rendered tuned has the deferred annotation and the filesystem flag does exist. Likewise, the regular reconcile loop will only set the filesystem flag if detects a rendered tuned with the deferred annotation.
The default is obviously kept to immediate updates, for backward compatibility.