Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix out-of-order issues with ConsistentHashingStickyKeyConsumerSelector #23327

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Sep 20, 2024

Fixes #23315
Fixes #23321

Motivation

See #23315 and #23321. This PR fixes a bug in ConsistentHashingStickyKeyConsumerSelector where the selected consumer for a hash could change across 2 existing consumers when a consumer was removed or when a new consumer was added.
In addition it fixes the problem that ConsistentHashingStickyKeyConsumerSelector.getConsumerKeyHashRanges() returned invalid information. These are fixed together since they are in the same code area and fixing getConsumerKeyHashRanges is required for testing the consistent selection of a consumer when there are hash collisions.

Modifications

In PR #8396 a solution was added to fix a problem where all hash ring points were assigned to the last arrived consumer when there were hash collisions, for example when the consumers had the same name:

That solution causes problems since it causes the bug #23315. A different solution is needed to distribute the consumers evenly when there are hash collisions.

The solution in this PR is to assign a consumer with a "consumer name index". When there are multiple consumers with the same name, the name index will be unique. Since any consumer name could be potentially used by consumers arriving later, it is necessary to track all consumer names. When all consumers have been removed for a specific consumer name, it shouldn't result in a memory leak. This is the reason why tracking reference counts is handled in the implementation.

The implementation contains a solution where a consumer leaving and joining immediately will get assigned with the same "consumer name index". This improves stability since the hash assignment changes are minimized over time, although a better solution would be to avoid reusing the same consumer name in the first place.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari added this to the 4.0.0 milestone Sep 20, 2024
@lhotari lhotari self-assigned this Sep 20, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 20, 2024
@lhotari lhotari marked this pull request as draft September 20, 2024 12:53
@lhotari lhotari force-pushed the lh-fix-ConsistentHashingStickyKeyConsumerSelector-select-bug branch from ac7fdeb to cc78aec Compare September 20, 2024 13:34
@lhotari lhotari marked this pull request as ready for review September 20, 2024 13:36
@lhotari lhotari changed the title [fix][broker] Fix out-of-order issue with ConsistentHashingStickyKeyConsumerSelector [fix][broker] Reduce out-of-order issues with ConsistentHashingStickyKeyConsumerSelector Sep 20, 2024
@lhotari
Copy link
Member Author

lhotari commented Sep 20, 2024

I renamed this to "reduce" out-of-order issues since I'm not sure if the current model in this PR prevents the case that a consumer gets swapped when an unrelated consumer leaves.
I was able to get the behavior fairly consistent, but it's not going to prevent all cases.

It seems that another hash ring layer is needed for the colliding consumers. That would prevent unnecessary changes. A hash ring inside a hash ring...

@lhotari lhotari marked this pull request as draft September 20, 2024 15:30
@lhotari lhotari force-pushed the lh-fix-ConsistentHashingStickyKeyConsumerSelector-select-bug branch from cc78aec to 6a1be71 Compare September 20, 2024 16:29
@lhotari lhotari marked this pull request as ready for review September 20, 2024 16:30
@lhotari lhotari marked this pull request as draft September 20, 2024 16:37
@lhotari lhotari force-pushed the lh-fix-ConsistentHashingStickyKeyConsumerSelector-select-bug branch 5 times, most recently from 06db6e3 to bae30bc Compare September 20, 2024 21:28
@lhotari lhotari changed the title [fix][broker] Reduce out-of-order issues with ConsistentHashingStickyKeyConsumerSelector [fix][broker] Fix out-of-order issues with ConsistentHashingStickyKeyConsumerSelector Sep 20, 2024
@lhotari lhotari force-pushed the lh-fix-ConsistentHashingStickyKeyConsumerSelector-select-bug branch from 80d6451 to 22b95a7 Compare October 1, 2024 12:34
@lhotari
Copy link
Member Author

lhotari commented Oct 1, 2024

Hash ranges are very rare with the current hash range size of 2^31-1 (Integer.MAX_VALUE).
In PIP-379, the hash range size is reduced to 2^16-1 (65535). In that case, hash range conflicts are common.
However, due to the large amount of hash ring points, there's no significant difference in distributions due to the conflicts.

Copy link
Contributor

@equanz equanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari lhotari merged commit adb9014 into apache:master Oct 2, 2024
51 checks passed
lhotari added a commit that referenced this pull request Oct 4, 2024
lhotari added a commit that referenced this pull request Oct 4, 2024
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 15, 2024
…ConsumerSelector (apache#23327)

(cherry picked from commit adb9014)
(cherry picked from commit 7f8e508)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 16, 2024
…ConsumerSelector (apache#23327)

(cherry picked from commit adb9014)
(cherry picked from commit 7f8e508)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants