Optimize experimental Kafka scaler and fix consumer group logic #5697
TL;DR:
I know we decided to implement the IAM auth in the sarama-based scaler (the original), but there is still an optimization that would serve current users of the experimental scaler and potentially lower the load on both KEDA and their brokers.
Firstly, the Metadata request is not scoped to the requested topics (which can be empty):
keda/pkg/scalers/apache_kafka_scaler.go
Lines 425 to 427 in bcaf5c0
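To make the scoping point concrete, here is a minimal sketch. It is not the code at the lines referenced above. `metadataTopics` is a hypothetical helper, and the sketch assumes, as this description implies, that a metadata request sent without a topic list returns every topic on the cluster.

```go
package main

import "fmt"

// metadataTopics is a hypothetical helper (not part of KEDA or kafka-go).
// It cleans the configured topic list and reports whether a metadata
// request built from it would be scoped. An unscoped request is assumed
// to return every topic on the cluster.
func metadataTopics(configured []string) (topics []string, scoped bool) {
	seen := make(map[string]struct{}, len(configured))
	for _, t := range configured {
		if t == "" {
			continue // ignore empty entries
		}
		if _, dup := seen[t]; dup {
			continue // ignore duplicates
		}
		seen[t] = struct{}{}
		topics = append(topics, t)
	}
	return topics, len(topics) > 0
}

func main() {
	topics, scoped := metadataTopics([]string{"orders", "", "orders", "payments"})
	fmt.Println(topics, scoped)

	// With no configured topics the request cannot be scoped from the
	// trigger configuration alone.
	topics, scoped = metadataTopics(nil)
	fmt.Println(topics, scoped)
}
```

When `scoped` is false, the topic list has to come from somewhere else before the metadata request is built, otherwise every topic is fetched on each polling cycle.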
If the list of topics is empty, we enter the branch to detect the topics & permissions based on the consumer group activity here:
https://github.com/kedacore/keda/blob/bcaf5c07e785e3e58e3be4e3707b518fdef6acde/pkg/scalers/apache_kafka_scaler.go#L441C3-L441C14
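As a rough illustration of what detecting topics from consumer group activity can look like, the sketch below merges per-member assignments into a single topic-to-partitions map. `memberAssignment` and `groupTopicPartitions` are hypothetical names. They stand in for whatever the describe-groups response exposes and are not the types used by KEDA or kafka-go.

```go
package main

import (
	"fmt"
	"sort"
)

// memberAssignment is a hypothetical stand-in for the assignment data of
// one consumer group member: topic name -> partitions assigned to it.
type memberAssignment map[string][]int

// groupTopicPartitions merges the assignments of all members into one map
// of topic -> sorted, de-duplicated partition list.
func groupTopicPartitions(members []memberAssignment) map[string][]int {
	sets := make(map[string]map[int]struct{})
	for _, m := range members {
		for topic, parts := range m {
			if sets[topic] == nil {
				sets[topic] = make(map[int]struct{})
			}
			for _, p := range parts {
				sets[topic][p] = struct{}{}
			}
		}
	}

	out := make(map[string][]int, len(sets))
	for topic, set := range sets {
		parts := make([]int, 0, len(set))
		for p := range set {
			parts = append(parts, p)
		}
		sort.Ints(parts)
		out[topic] = parts
	}
	return out
}

func main() {
	members := []memberAssignment{
		{"orders": {0, 1}},
		{"orders": {2}, "payments": {0}},
	}
	fmt.Println(groupTopicPartitions(members))
}
```

The keys of the resulting map can then be used to scope the metadata request to the topics the group actually consumes. A group with no members yields an empty map, so that case needs separate handling.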
On my AWS MSK cluster, the version of the response is not supported by the segmentio/kafka-go library, which causes the following to error out, as seen in segmentio/kafka-go#1212 (probably):
Alright, for the sake of debugging, let's ignore the error. What happens next? From what I can see, the variable `describeGrp` is unused, and because the MetadataRequest is empty, the whole list of topics is processed and returned with their partitions. All the time. This in turn causes KEDA to request all consumer and producer offsets again and again. So to me, the current behavior is buggy. The `describeGrp` response should be processed to extract the list of topics and partitions for the consumer group, which I also correct in this PR.
Checklist
Fixes #
Relates to #5531