Restoration of a member in a multi-node cluster #645

unmarshall · 2023-07-14T05:18:10Z

Describe the bug:

If an existing etcd member crashes and comes back up with a data directory that is no longer valid, then in a multi-node setup the data directory is removed and only a limited number of attempts are made to add the member as a learner. Now consider a case where more than one member goes down and both are trying to recover (in a 5-member cluster). Quorum is still intact, so both members may attempt to add themselves as learners at the same time, and one of them will fail.

Expected behavior:
In the scale-up case, adding the current candidate as a learner is attempted repeatedly (up to 6 times). The same should be done when restoring a member in a multi-node cluster requires it to be added as a learner.

ishan16696 · 2023-07-14T11:44:49Z

Now consider a case where more than one member goes down and both are trying to recover (in a 5-member cluster). Quorum is still intact, so both members may attempt to add themselves as learners at the same time, and one of them will fail.

I agree with the concern, but take this scenario in a 5-member cluster:

etcd-0 --> leader
etcd-1 --> follower
etcd-2 --> follower
etcd-3 --> goes down due to data-dir corruption
etcd-4 --> goes down due to data-dir corruption
---
quorum is still there (3/5 is up)
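
The "3/5 is up" check follows from the usual majority rule, which can be sketched as below. The helper names `quorum` and `hasQuorum` are made up for this example and are not taken from backup-restore.

```go
package main

import "fmt"

// quorum returns the minimum number of voting members that must be up
// in a cluster of the given size: floor(members/2) + 1.
func quorum(members int) int {
	return members/2 + 1
}

// hasQuorum reports whether a cluster with the given number of voting
// members still has a majority when only `up` of them are healthy.
func hasQuorum(members, up int) bool {
	return up >= quorum(members)
}

func main() {
	// 5-member cluster: etcd-3 and etcd-4 are down, 3 members remain.
	fmt.Println(quorum(5), hasQuorum(5, 3), hasQuorum(5, 2)) // prints "3 true false"
}
```

So a 5-member cluster tolerates two members being down at once, which is exactly the situation in which both of them can race to be added as learners.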

Now, as you already mentioned, backup-restore will detect during the initialisation phase that this is a single-member restoration case. Each backup-restore will therefore clean up the old data-dir and then try to add its etcd as a learner (non-voting member), both at the same time.
But only one backup-restore will succeed in adding its corresponding etcd as a learner. The other's initialisation call will fail:

etcd-0 --> leader
etcd-1 --> follower
etcd-2 --> follower
etcd-3 --> learner --> get promoted to follower
etcd-4 --> still down as adding a learner call failed
---
quorum is there (4/5 is up)

Now, IMO, initialisation will be re-triggered and backup-restore will again detect this as a single-member restoration case, so this time etcd-4 will be added as a learner successfully.
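
The sequence above can be played through with a toy model. This is only a sketch under stated assumptions: the `cluster` type, `errLearnerSlotBusy`, and `recoverMembers` are hypothetical and do not use the etcd API, and the model assumes the cluster admits one learner at a time and promotes it before the next initialisation round.

```go
package main

import (
	"errors"
	"fmt"
)

// errLearnerSlotBusy is a hypothetical error meaning "a learner is already present".
var errLearnerSlotBusy = errors.New("learner slot busy")

// cluster is a toy model, not the etcd API: it admits one learner at a time.
type cluster struct {
	voters  map[string]bool
	learner string // empty when no learner is present
}

// addLearner admits name as a learner if the slot is free.
func (c *cluster) addLearner(name string) error {
	if c.learner != "" {
		return errLearnerSlotBusy
	}
	c.learner = name
	return nil
}

// promote turns the current learner into a voting member and frees the slot.
func (c *cluster) promote() {
	if c.learner != "" {
		c.voters[c.learner] = true
		c.learner = ""
	}
}

// recoverMembers runs initialisation rounds until every pending member is
// a voter again, and returns how many rounds that took.
func recoverMembers(c *cluster, pending []string) int {
	rounds := 0
	for len(pending) > 0 {
		rounds++
		var failed []string
		for _, m := range pending {
			if err := c.addLearner(m); err != nil {
				// Initialisation fails for this member and is re-triggered.
				failed = append(failed, m)
			}
		}
		c.promote()
		pending = failed
	}
	return rounds
}

func main() {
	c := &cluster{voters: map[string]bool{"etcd-0": true, "etcd-1": true, "etcd-2": true}}
	rounds := recoverMembers(c, []string{"etcd-3", "etcd-4"})
	fmt.Println(rounds, len(c.voters)) // prints "2 5"
}
```

In this model etcd-3 wins the first round, etcd-4 fails and is retried, and the second round brings the cluster back to five voting members.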

ishan16696 · 2023-07-14T11:49:11Z

In the scale-up case, adding the current candidate as a learner is attempted repeatedly (up to 6 times). The same should be done when restoring a member in a multi-node cluster requires it to be added as a learner.

No, it isn't required. In the scale-up case we want to avoid going down the wrong path if adding a learner fails; that's why we throw a fatal error there.
But in this case backup-restore has detected the single-member restoration correctly, and will detect it correctly again even if its previous attempts to add a learner failed.
And the number of retries will be taken care of by the re-trigger of the initialisation call whenever the previous initialisation call failed.

ishan16696 · 2023-08-08T14:11:18Z

@unmarshall can you please close this issue if you are satisfied with the comments above: #645 (comment)

unmarshall · 2023-08-09T06:04:58Z

In a previous ticket we made a change where we attempt to add a learner a few times, then give up and exit the container, which results in a container restart. I discussed this with @shreyas-s-rao and we agreed that the earlier approach of trying to add-as-learner an unlimited number of times was sufficient. A restart of the container does not alleviate this in any way. So it would probably make sense to remove the limit in both situations and wait until the etcd member is added as a learner (either in the case of a new member or a restart of an existing member in a multi-node cluster).

unmarshall added the kind/bug, size/xs, and exp/beginner labels Jul 14, 2023

gardener-robot added the lifecycle/stale label Apr 17, 2024

gardener-robot added the lifecycle/rotten label and removed the lifecycle/stale label Dec 25, 2024
