feat: Supports upper-layer modification of the InstanceSet's UpdateStrategy #7939
@@ -75,6 +75,35 @@ type SchedulingPolicy struct {
TopologySpreadConstraints []corev1.TopologySpreadConstraint `json:"topologySpreadConstraints,omitempty"`
}

// InstanceUpdateStrategy indicates the strategy that the InstanceSet
// controller will use to perform updates. It includes any additional parameters
// necessary to perform the update for the indicated strategy.
type InstanceUpdateStrategy struct {
// Partition indicates the number of pods that should be updated during a rolling update.
// The remaining pods will remain untouched. This is helpful in defining how many pods
// should participate in the update process. The update process will follow the order
// of pod names in descending lexicographical (dictionary) order. The default value is
// Replicas (i.e., update all pods).
// +optional
Partition *int32 `json:"partition,omitempty"`
// The maximum number of pods that can be unavailable during the update.
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
// Absolute number is calculated from percentage by rounding up. This can not be 0.
// Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
// it will be counted towards MaxUnavailable.
// +optional
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
The term 'maxUnavailable' is misleading. In the original StatefulSet, updates always involved restarting pods, and any update to a pod would cause it to become unavailable. Therefore, controlling the number of maxUnavailable pods represented the level of concurrency. However, in InstanceSet, if we continue to use 'maxUnavailable', strictly speaking, all non-restarting updates (i.e., those that don't cause the pod to become unavailable) would not be controlled by this parameter.

Why should we control the non-restarting updates?

Because a 'non-restarting update' is also a change in the production environment, it should be controllable by a gradual upgrade strategy. For example, it is reasonable to allow users to apply changes to a small number of replicas first and, after verification, gradually roll them out to more replicas. A 'non-restarting update' might involve parameter modifications, which could lead to issues such as performance degradation; or it might be an IP whitelist change, which could inadvertently block legitimate traffic from apps.

To summarize, incorrect 'non-restarting updates' can potentially harm business continuity, so users may require a gradual update process. This is something we need to consider.

@weicao I think, for the 'non-restarting update', it seems that we don't have a way to automate the control. How about we use ...
When you have multiple instance templates, how do you plan to use partitions?

What I mean is, say you have 3 templates and a total of 10 replicas. When you set the partition to 7 (upgrading the 8th, 9th, and 10th replicas), it becomes difficult to determine which templates these 8th, 9th, and 10th replicas belong to. This makes it challenging to verify whether the updated replicas meet the expected outcomes.

@weicao We have also considered this issue. However, based on our current requirements, we do not need to specify a separate partition for gray upgrades across multiple templates. Nevertheless, I understand the need for multi-template partitioning, and we can discuss and design a solution for it.

Currently, our requirement is only for the partition to be able to perform a global update in alphabetical order.
// Members(Pods) update strategy.
//
// - serial: update Members one by one that guarantee minimum component unavailable time.
// - bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
// - parallel: force parallel
//
// +kubebuilder:validation:Enum={Serial,BestEffortParallel,Parallel}
// +optional
MemberUpdateStrategy *MemberUpdateStrategy `json:"memberUpdateStrategy,omitempty"`
'memberUpdateStrategy' and 'maxUnavailable' are not orthogonal.

Yes, but ...

So 'maxUnavailable' takes effect when 'memberUpdateStrategy' is set to 'bestEffortParallel' and 'parallel'?
}
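As a side note for readers of this diff, here is a minimal sketch of how the new nested strategy could be populated from Go. It is not part of the change itself: it assumes the workloads API package that defines these types, an imported `k8s.io/apimachinery/pkg/util/intstr`, a hypothetical helper name `exampleUpdateStrategy`, and it writes the enum value as a raw string because the named MemberUpdateStrategy constants are not visible in this hunk.

```go
// Assumed context: the workloads API package defining InstanceSetSpec,
// InstanceUpdateStrategy, and MemberUpdateStrategy, with
// "k8s.io/apimachinery/pkg/util/intstr" imported.
func exampleUpdateStrategy() InstanceSetSpec {
	partition := int32(3)                      // update 3 pods, walked in descending dictionary order
	maxUnavailable := intstr.FromString("25%") // at most 25% of desired pods unavailable at once
	serial := MemberUpdateStrategy("Serial")   // raw enum value; the named constant is not shown in this hunk
	return InstanceSetSpec{
		UpdateStrategy: InstanceUpdateStrategy{
			Partition:            &partition,
			MaxUnavailable:       &maxUnavailable,
			MemberUpdateStrategy: &serial,
		},
	}
}
```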
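On the "rounding up" rule in the maxUnavailable doc comment above: apimachinery already ships a helper that converts a percentage into an absolute count. Whether the InstanceSet controller uses exactly this helper is an assumption; the sketch below only illustrates the arithmetic the comment describes (a percentage of desired pods, rounded up).

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
	maxUnavailable := intstr.FromString("25%") // could also be an absolute number, e.g. intstr.FromInt(1)
	replicas := 5                              // desired pods

	// 25% of 5 = 1.25, rounded up to 2 (roundUp = true).
	n, err := intstr.GetScaledValueFromIntOrPercent(&maxUnavailable, replicas, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 2
}
```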
// Range represents a range with a start and an end value.
// It is used to define a continuous segment.
type Range struct {
@@ -326,10 +355,9 @@ type InstanceSetSpec struct {
// Indicates the StatefulSetUpdateStrategy that will be
// employed to update Pods in the InstanceSet when a revision is made to
// Template.
// UpdateStrategy.Type will be set to appsv1.OnDeleteStatefulSetStrategyType if MemberUpdateStrategy is not nil
//
// Note: This field will be removed in future version.
UpdateStrategy appsv1.StatefulSetUpdateStrategy `json:"updateStrategy,omitempty"`
UpdateStrategy InstanceUpdateStrategy `json:"updateStrategy,omitempty"`

// A list of roles defined in the system.
//

@@ -348,11 +376,13 @@ type InstanceSetSpec struct {

// Members(Pods) update strategy.
//
// Deprecated since v0.9.0
// - serial: update Members one by one that guarantee minimum component unavailable time.
// - bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
// - parallel: force parallel
//
// +kubebuilder:validation:Enum={Serial,BestEffortParallel,Parallel}
// +kubebuilder:deprecatedversion:warning="This field has been deprecated since 0.9.0"
// +optional
MemberUpdateStrategy *MemberUpdateStrategy `json:"memberUpdateStrategy,omitempty"`
Do we need partition? @free6om After introducing instance templates, the naming of Pods managed by an InstanceSet follows this pattern: "comp-0, comp-1, comp-tpl0-0, comp-tpl1-0, comp-tpl1-1". Unlike before, the Pods no longer have a linear ordinal numbering scheme. This makes specifying a partition much more challenging.
Yes, I feel the same way. Currently, the partition feature allows Pods to be updated in a rolling update based on dictionary order from largest to smallest, but it seems there is no way to perform a rolling update on a specific Template at the moment.
How do we do a RollingUpdate then?
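To make the ordering question concrete, here is a small, self-contained sketch (not code from this PR) that reads the Partition doc comment literally: sort pod names in descending dictionary order and treat Partition as the number of pods to update. With the example names from this thread, a Partition of 3 selects pods from two different templates, which is exactly the ambiguity raised above. The function name `podsToUpdate` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// podsToUpdate returns the first `partition` pod names in descending
// lexicographical (dictionary) order, per the Partition doc comment above.
func podsToUpdate(names []string, partition int) []string {
	sorted := append([]string(nil), names...)
	sort.Sort(sort.Reverse(sort.StringSlice(sorted)))
	if partition > len(sorted) {
		partition = len(sorted)
	}
	return sorted[:partition]
}

func main() {
	pods := []string{"comp-0", "comp-1", "comp-tpl0-0", "comp-tpl1-0", "comp-tpl1-1"}
	// Prints [comp-tpl1-1 comp-tpl1-0 comp-tpl0-0]: the selection spans tpl1 and tpl0.
	fmt.Println(podsToUpdate(pods, 3))
}
```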