Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Supports upper-layer modification of the InstanceSet's UpdateStrategy #7939

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions apis/apps/v1alpha1/cluster_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -394,6 +394,25 @@ type UserResourceRefs struct {
ConfigMapRefs []ConfigMapRef `json:"configMapRefs,omitempty"`
}

// InstanceUpdateStrategy indicates the strategy that the InstanceSet
// controller will use to perform updates.
type InstanceUpdateStrategy struct {
// Partition indicates the number of pods that should be updated during a rolling update.
// The remaining pods will remain untouched. This is helpful in defining how many pods
// should participate in the update process. The update process will follow the order
// of pod names in descending lexicographical (dictionary) order. The default value is
// Replicas (i.e., update all pods).
// +optional
Partition *int32 `json:"partition,omitempty"`
// The maximum number of pods that can be unavailable during the update.
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
// Absolute number is calculated from percentage by rounding up. This can not be 0.
// Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
// it will be counted towards MaxUnavailable.
// +optional
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
}

// InstanceTemplate allows customization of individual replica configurations in a Component.
type InstanceTemplate struct {
// Name specifies the unique name of the instance Pod created using this InstanceTemplate.
Expand Down Expand Up @@ -800,6 +819,13 @@ type ClusterComponentSpec struct {
// +optional
UpdateStrategy *UpdateStrategy `json:"updateStrategy,omitempty"`

// Indicates the InstanceUpdateStrategy that will be
// employed to update Pods in the InstanceSet when a revision is made to
// Template.
//
// +optional
InstanceUpdateStrategy InstanceUpdateStrategy `json:"instanceUpdateStrategy,omitempty"`

// Controls the concurrency of pods during initial scale up, when replacing pods on nodes,
// or when scaling down. It only used when `PodManagementPolicy` is set to `Parallel`.
// The default Concurrency is 100%.
Expand Down
7 changes: 7 additions & 0 deletions apis/apps/v1alpha1/component_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,13 @@ type ComponentSpec struct {
// +optional
ServiceAccountName string `json:"serviceAccountName,omitempty"`

// Indicates the InstanceUpdateStrategy that will be
// employed to update Pods in the InstanceSet when a revision is made to
// Template.
//
// +optional
InstanceUpdateStrategy InstanceUpdateStrategy `json:"instanceUpdateStrategy,omitempty"`

// Controls the concurrency of pods during initial scale up, when replacing pods on nodes,
// or when scaling down. It only used when `PodManagementPolicy` is set to `Parallel`.
// The default Concurrency is 100%.
Expand Down
27 changes: 27 additions & 0 deletions apis/apps/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

34 changes: 32 additions & 2 deletions apis/workloads/v1alpha1/instanceset_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,35 @@ type SchedulingPolicy struct {
TopologySpreadConstraints []corev1.TopologySpreadConstraint `json:"topologySpreadConstraints,omitempty"`
}

// InstanceUpdateStrategy indicates the strategy that the InstanceSet
// controller will use to perform updates. It includes any additional parameters
// necessary to perform the update for the indicated strategy.
type InstanceUpdateStrategy struct {
// Partition indicates the number of pods that should be updated during a rolling update.
// The remaining pods will remain untouched. This is helpful in defining how many pods
// should participate in the update process. The update process will follow the order
// of pod names in descending lexicographical (dictionary) order. The default value is
// Replicas (i.e., update all pods).
// +optional
Partition *int32 `json:"partition,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need partition? @free6om

After introducing instance templates, the naming of Pods managed by an InstanceSet follows this pattern: "comp-0, comp-1, comp-tpl0-0, comp-tpl1-0, comp-tpl1-1"

Unlike before, the Pods no longer have a linear ordinal numbering scheme. This makes specifying a partition much more challenging.

Copy link
Contributor Author

@YTGhost YTGhost Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I feel the same way. Currently, the partition feature allows Pods to be updated in a rolling update based on dictionary order from largest to smallest, but it seems there is no way to perform a rolling update on a specific Template at the moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we do a RollingUpdate then?

// The maximum number of pods that can be unavailable during the update.
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
// Absolute number is calculated from percentage by rounding up. This can not be 0.
// Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
// it will be counted towards MaxUnavailable.
// +optional
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
Copy link
Contributor

@weicao weicao Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term 'maxUnavailable' is misleading. In the original StatefulSet, updates always involved restarting pods, and any update to a pod would cause it to become unavailable. Therefore, controlling the number of maxUnavailable pods represented the level of concurrency.

However, in InstanceSet, if we continue to use 'maxUnavailable', strictly speaking, all non-restarting updates (i.e., those that don't cause the pod to become unavailable) would not be controlled by this parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we control the non-restarting updates as they are taking effect instantly after the Pods being updated? Do all the non-restarting updates in one reconciliation loop seems no difference with doing them in several reconciliation loops.

Copy link
Contributor

@weicao weicao Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because 'non-restarting update' is also a change in production environment, it should be able to be controlled by a gradual upgrade strategy. For example, it's reasonable to allow users to make changes to a small number of replicas first, and after verification, gradually roll out to more replicas.

e.g. A 'non-restarting update' might involve parameter modifications, which could potentially lead to issues such as performance degradation. Or, it might be an IP whitelist change, which could inadvertently block legitimate traffic from apps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize, incorrect 'non-restarting updates' can potentially harm business continuity. Therefore, users may require a gradual update process. This is something we need to consider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weicao I think, for the 'non-restarting update,' it seems that we don't have a way to automate the control. How about we use partition to manually control it instead?

Copy link
Contributor

@weicao weicao Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weicao I think, for the 'non-restarting update,' it seems that we don't have a way to automate the control. How about we use partition to manually control it instead?

When you have multiple instance templates, how do you plan to use partitions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is, say if you have 3 templates and a total of 10 replicas, when you set the partition to 7 (upgrading the 8th, 9th, and 10th replicas), it becomes difficult to determine which templates these 8th, 9th, and 10th replicas belong to. This makes it challenging to verify whether the updated replicas meet the expected outcomes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is, say if you have 3 templates and a total of 10 replicas, when you set the partition to 7 (upgrading the 8th, 9th, and 10th replicas), it becomes difficult to determine which templates these 8th, 9th, and 10th replicas belong to. This makes it challenging to verify whether the updated replicas meet the expected outcomes.

@weicao We have also considered this issue. However, based on our current requirements, we do not need to specify a separate partition for gray upgrades in multiple templates. Nevertheless, I understand the need for multi-template partitioning, and we can discuss and design a solution for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, our requirement is only for the partition to be able to perform a global update in alphabetical order.

// Members(Pods) update strategy.
//
// - serial: update Members one by one that guarantee minimum component unavailable time.
// - bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
// - parallel: force parallel
//
// +kubebuilder:validation:Enum={Serial,BestEffortParallel,Parallel}
// +optional
MemberUpdateStrategy *MemberUpdateStrategy `json:"memberUpdateStrategy,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'memberUpdateStrategy' and 'maxUnavailable' are not orthogonal.

  • When 'memberUpdateStrategy' is set to 'serial', 'maxUnavailable' has no effect, right?
  • When 'memberUpdateStrategy' is set to 'bestEffortParallel', the concurrency is calculated based on quorum, so 'maxUnavailable' should not have an effect either, right?
  • Therefore, does 'maxUnavailable' only take effect when 'memberUpdateStrategy' is set to 'parallel'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but MaxUnavailable means no more than a total number of Pods defined by MaxAvailable should be unavailable. It's upper bound.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so 'maxUnavailable' takes effect when 'memberUpdateStrategy' is set to 'bestEffortParallel' and 'parallel'?

}

// Range represents a range with a start and an end value.
// It is used to define a continuous segment.
type Range struct {
Expand Down Expand Up @@ -326,10 +355,9 @@ type InstanceSetSpec struct {
// Indicates the StatefulSetUpdateStrategy that will be
// employed to update Pods in the InstanceSet when a revision is made to
// Template.
// UpdateStrategy.Type will be set to appsv1.OnDeleteStatefulSetStrategyType if MemberUpdateStrategy is not nil
//
// Note: This field will be removed in future version.
UpdateStrategy appsv1.StatefulSetUpdateStrategy `json:"updateStrategy,omitempty"`
UpdateStrategy InstanceUpdateStrategy `json:"updateStrategy,omitempty"`

// A list of roles defined in the system.
//
Expand All @@ -348,11 +376,13 @@ type InstanceSetSpec struct {

// Members(Pods) update strategy.
//
// Deprecated since v0.9.0
// - serial: update Members one by one that guarantee minimum component unavailable time.
// - bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
// - parallel: force parallel
//
// +kubebuilder:validation:Enum={Serial,BestEffortParallel,Parallel}
// +kubebuilder:deprecatedversion:warning="This field has been deprecated since 0.9.0"
// +optional
MemberUpdateStrategy *MemberUpdateStrategy `json:"memberUpdateStrategy,omitempty"`

Expand Down
30 changes: 30 additions & 0 deletions apis/workloads/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

54 changes: 54 additions & 0 deletions config/crd/bases/apps.kubeblocks.io_clusters.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -655,6 +655,33 @@ spec:
- name
type: object
type: array
instanceUpdateStrategy:
description: |-
Indicates the InstanceUpdateStrategy that will be
employed to update Pods in the InstanceSet when a revision is made to
Template.
properties:
maxUnavailable:
anyOf:
- type: integer
- type: string
description: |-
The maximum number of pods that can be unavailable during the update.
Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
Absolute number is calculated from percentage by rounding up. This can not be 0.
Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
it will be counted towards MaxUnavailable.
x-kubernetes-int-or-string: true
partition:
description: |-
Partition indicates the number of pods that should be updated during a rolling update.
The remaining pods will remain untouched. This is helpful in defining how many pods
should participate in the update process. The update process will follow the order
of pod names in descending lexicographical (dictionary) order. The default value is
Replicas (i.e., update all pods).
format: int32
type: integer
type: object
instances:
description: |-
Allows for the customization of configuration values for each instance within a Component.
Expand Down Expand Up @@ -9772,6 +9799,33 @@ spec:
- name
type: object
type: array
instanceUpdateStrategy:
description: |-
Indicates the InstanceUpdateStrategy that will be
employed to update Pods in the InstanceSet when a revision is made to
Template.
properties:
maxUnavailable:
anyOf:
- type: integer
- type: string
description: |-
The maximum number of pods that can be unavailable during the update.
Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
Absolute number is calculated from percentage by rounding up. This can not be 0.
Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
it will be counted towards MaxUnavailable.
x-kubernetes-int-or-string: true
partition:
description: |-
Partition indicates the number of pods that should be updated during a rolling update.
The remaining pods will remain untouched. This is helpful in defining how many pods
should participate in the update process. The update process will follow the order
of pod names in descending lexicographical (dictionary) order. The default value is
Replicas (i.e., update all pods).
format: int32
type: integer
type: object
instances:
description: |-
Allows for the customization of configuration values for each instance within a Component.
Expand Down
27 changes: 27 additions & 0 deletions config/crd/bases/apps.kubeblocks.io_components.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -401,6 +401,33 @@ spec:
- name
type: object
type: array
instanceUpdateStrategy:
description: |-
Indicates the InstanceUpdateStrategy that will be
employed to update Pods in the InstanceSet when a revision is made to
Template.
properties:
maxUnavailable:
anyOf:
- type: integer
- type: string
description: |-
The maximum number of pods that can be unavailable during the update.
Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
Absolute number is calculated from percentage by rounding up. This can not be 0.
Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
it will be counted towards MaxUnavailable.
x-kubernetes-int-or-string: true
partition:
description: |-
Partition indicates the number of pods that should be updated during a rolling update.
The remaining pods will remain untouched. This is helpful in defining how many pods
should participate in the update process. The update process will follow the order
of pod names in descending lexicographical (dictionary) order. The default value is
Replicas (i.e., update all pods).
format: int32
type: integer
type: object
instances:
description: |-
Allows for the customization of configuration values for each instance within a Component.
Expand Down
62 changes: 32 additions & 30 deletions config/crd/bases/workloads.kubeblocks.io_instancesets.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3947,6 +3947,7 @@ spec:
Members(Pods) update strategy.


Deprecated since v0.9.0
- serial: update Members one by one that guarantee minimum component unavailable time.
- bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
- parallel: force parallel
Expand Down Expand Up @@ -12391,42 +12392,43 @@ spec:
Indicates the StatefulSetUpdateStrategy that will be
employed to update Pods in the InstanceSet when a revision is made to
Template.
UpdateStrategy.Type will be set to appsv1.OnDeleteStatefulSetStrategyType if MemberUpdateStrategy is not nil


Note: This field will be removed in future version.
properties:
rollingUpdate:
description: RollingUpdate is used to communicate parameters when
Type is RollingUpdateStatefulSetStrategyType.
properties:
maxUnavailable:
anyOf:
- type: integer
- type: string
description: |-
The maximum number of pods that can be unavailable during the update.
Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
Absolute number is calculated from percentage by rounding up. This can not be 0.
Defaults to 1. This field is alpha-level and is only honored by servers that enable the
MaxUnavailableStatefulSet feature. The field applies to all pods in the range 0 to
Replicas-1. That means if there is any unavailable pod in the range 0 to Replicas-1, it
will be counted towards MaxUnavailable.
x-kubernetes-int-or-string: true
partition:
description: |-
Partition indicates the ordinal at which the StatefulSet should be partitioned
for updates. During a rolling update, all pods from ordinal Replicas-1 to
Partition are updated. All pods from ordinal Partition-1 to 0 remain untouched.
This is helpful in being able to do a canary based deployment. The default value is 0.
format: int32
type: integer
type: object
type:
maxUnavailable:
anyOf:
- type: integer
- type: string
description: |-
The maximum number of pods that can be unavailable during the update.
Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
Absolute number is calculated from percentage by rounding up. This can not be 0.
Defaults to 1. The field applies to all pods. That means if there is any unavailable pod,
it will be counted towards MaxUnavailable.
x-kubernetes-int-or-string: true
memberUpdateStrategy:
description: |-
Type indicates the type of the StatefulSetUpdateStrategy.
Default is RollingUpdate.
Members(Pods) update strategy.


- serial: update Members one by one that guarantee minimum component unavailable time.
- bestEffortParallel: update Members in parallel that guarantee minimum component un-writable time.
- parallel: force parallel
enum:
- Serial
- BestEffortParallel
- Parallel
type: string
partition:
description: |-
Partition indicates the number of pods that should be updated during a rolling update.
The remaining pods will remain untouched. This is helpful in defining how many pods
should participate in the update process. The update process will follow the order
of pod names in descending lexicographical (dictionary) order. The default value is
Replicas (i.e., update all pods).
format: int32
type: integer
type: object
volumeClaimTemplates:
description: |-
Expand Down
Loading
Loading