[WIP] A prototype implementation to improve startup performance #3821
+103
−9
Issue
#3326
Description
DON'T MERGE THIS
Built image for test:
m00nf1sh/aws-load-balancer-controller:v2.8.3-checkpoint
Intro
This is a prototype implementation to improve performance during controller startup. The root cause is that on restart (e.g. a leadership change or a hardware issue on the node), the controller reconciles all existing Ingresses and Services in the cluster. In large clusters with many Ingresses and Services, this can take a long time due to AWS API throttling, which impairs the controller's ability to handle other events (e.g. pod deployments).
Design
The idea is to save the "last reconciled state" as an annotation on Ingress and Service objects (and potentially TargetGroupBindings as well). On controller restart, it can then compare the current state with the "last reconciled state" from the annotation and skip reconciling resources that are already reconciled.
In this implementation, the "last reconciled state" is computed as the sha256 of the ELBv2 JSON model built for Ingresses and Services. For TargetGroupBindings, it could be the list of current backend targets (TODO: needs some refactoring).
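As a rough illustration of the checkpoint idea above, here is a minimal, self-contained Go sketch. It is not the code from this PR: the annotation key `checkpointAnnotation`, the helpers `computeCheckpoint` and `needsReconcile`, and the toy model are all hypothetical names chosen for the example. It assumes the model serializes to deterministic JSON, which holds for `encoding/json` with structs and maps (map keys are sorted on marshal).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// checkpointAnnotation is a hypothetical annotation key used only in this
// sketch; the key used by the actual prototype is not stated in the PR text.
const checkpointAnnotation = "example.com/last-reconciled-checkpoint"

// computeCheckpoint returns the hex-encoded sha256 of the JSON serialization
// of the given model. encoding/json sorts map keys, so equal models yield
// equal checkpoints.
func computeCheckpoint(model any) (string, error) {
	data, err := json.Marshal(model)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

// needsReconcile reports whether the checkpoint stored in the object's
// annotations is missing or differs from the freshly computed one.
func needsReconcile(annotations map[string]string, checkpoint string) bool {
	last, ok := annotations[checkpointAnnotation]
	return !ok || last != checkpoint
}

func main() {
	// A toy stand-in for the ELBv2 model built for an Ingress or Service.
	model := map[string]any{
		"loadBalancer": map[string]any{"scheme": "internet-facing"},
		"listeners":    []int{80, 443},
	}

	checkpoint, err := computeCheckpoint(model)
	if err != nil {
		panic(err)
	}

	annotations := map[string]string{}
	fmt.Println("first pass needs reconcile:", needsReconcile(annotations, checkpoint))

	// After a successful reconcile, the checkpoint is saved on the object.
	annotations[checkpointAnnotation] = checkpoint
	fmt.Println("after restart needs reconcile:", needsReconcile(annotations, checkpoint))
}
```

One caveat with a hash of the desired-state model alone: it says nothing about drift on the AWS side, so a change made outside the controller would not be detected by this comparison.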
Alternative design considered:
Next Steps
Checklist