
Pod disruption budget to manage availability #32

Open
JohnStrunk opened this issue Jul 9, 2018 · 1 comment
Labels
feature New feature or request

Comments

@JohnStrunk
Member

Describe the feature you'd like to have.
The operator should maintain a pod disruption budget for the Gluster cluster pods to prevent voluntary disruptions from hurting service availability. After a Gluster pod is down for any reason, the data hosted on that pod will likely need to be healed before the next outage can be fully tolerated. Having a disruption budget will prevent Kubernetes from voluntarily taking down a pod until the proper number of pods are up and healthy.

What is the value to the end user? (why is it a priority?)
Users expect storage to be continuously available, through both planned and unplanned events. Having properly maintained disruption budgets will prevent voluntary events (upgrades, etc.) from causing outages.

How will we know we have a good solution? (acceptance criteria)

  • The operator should manage a pod disruption budget object that refers to the gluster pods
  • The operator should update the minimum-available count based on the size of the cluster (a rough sketch follows this list)
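
A rough sketch of what such a managed object might look like from the operator's side. This is only illustrative: the gluster-pdb name, the app=glusterfs selector, and the use of client-go's policy/v1beta1 API are assumptions here, not settled details.

```go
package gluster

import (
	policyv1beta1 "k8s.io/api/policy/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"k8s.io/client-go/kubernetes"
)

// ensurePDB creates a PodDisruptionBudget covering the Gluster pods.
// First-cut policy from this issue: minAvailable = nodes - 1.
// The name, namespace handling, and label selector are placeholders.
func ensurePDB(client kubernetes.Interface, namespace string, clusterSize int) error {
	minAvailable := intstr.FromInt(clusterSize - 1)

	pdb := &policyv1beta1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "gluster-pdb",
			Namespace: namespace,
		},
		Spec: policyv1beta1.PodDisruptionBudgetSpec{
			MinAvailable: &minAvailable,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "glusterfs"},
			},
		},
	}

	// Create only; a real operator would reconcile (create-or-update)
	// and handle the already-exists case.
	_, err := client.PolicyV1beta1().PodDisruptionBudgets(namespace).Create(pdb)
	return err
}
```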

Additional context
This item will need some investigation (and may not actually be usable):

  • We would like to consider a pod "disrupted"/unhealthy if it has pending heals on any of its volumes.
    • Is having the health check reflect pending heals the correct approach?
    • Would an extended period of unhealthiness cause the pod to be killed (we don't want that)?
  • As a first cut the operator would set min available to be (nodes - 1), but this is overly conservative.
    • An alternate approach would be to have a budget per AZ, requiring (az_nodes - 1) to be available. This would permit more parallelism during upgrades.
JohnStrunk added the feature label on Jul 9, 2018
@JohnStrunk
Member Author

Upon further discussion, the above solution of having the PDB with minAvailable of (n - 1) and using the pod readiness probe to indicate pending heals is flawed. By signaling "unready", the pod makes itself ineligible to receive traffic that is redirected via a service. Since we intend to use a service as the primary method for contacting the Gluster cluster, this is undesirable.

The suggested approach is to use the PDB's .spec.minAvailable field and have the operator manually toggle it between N (when there are pending heals) and N-1 (when all volumes are fully in-sync).
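
A sketch of that toggle, continuing the assumptions above (PDB named gluster-pdb; a Kubernetes version that allows PDB spec updates; how pending heals are detected is out of scope here):

```go
package gluster

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"k8s.io/client-go/kubernetes"
)

// adjustPDB toggles .spec.minAvailable between N (heals pending: block all
// voluntary disruptions) and N-1 (all volumes in sync: allow one pod to be
// taken down). Determining pendingHeals (e.g. from gluster heal info) is
// not part of this sketch.
func adjustPDB(client kubernetes.Interface, namespace string, clusterSize int, pendingHeals bool) error {
	pdb, err := client.PolicyV1beta1().PodDisruptionBudgets(namespace).Get("gluster-pdb", metav1.GetOptions{})
	if err != nil {
		return err
	}

	min := clusterSize - 1
	if pendingHeals {
		min = clusterSize
	}

	target := intstr.FromInt(min)
	if pdb.Spec.MinAvailable != nil && *pdb.Spec.MinAvailable == target {
		return nil // already at the desired value; nothing to do
	}
	pdb.Spec.MinAvailable = &target

	_, err = client.PolicyV1beta1().PodDisruptionBudgets(namespace).Update(pdb)
	return err
}
```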
