Skip to content
This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

Knative shard allocation strategy #386

Open
jroper opened this issue Jul 7, 2020 · 0 comments
Open

Knative shard allocation strategy #386

jroper opened this issue Jul 7, 2020 · 0 comments

Comments

@jroper
Copy link
Member

jroper commented Jul 7, 2020

For now just creating this issue to capture/publish some discussion I've had with the Akka maintainers. Looking at integration with Knative, one thing that undermines Knatives approach to deployment is cluster sharding - Knative allows specifying how much traffic to route to each revision of a deployment, but if we then go and balance shards equally across all nodes for all revisions, then it won't really work. This would also cause big problems for example if trying to use CPU based autoscaling, since the sharding will cause nodes that no longer should be receiving any traffic to still have significant CPU usage and so they'll never be scaled down, and therefore will always have load.

As a solution to this, we can create a shard allocation strategy that will match Knatives routing configuration. Here's the approach that the Akka maintainers have agreed would be good to try. It's in two parts:

  1. Weighted shard allocation strategy. This is a shard allocation strategy that has a pluggable weighting strategy. The weighting strategy, given a list of nodes (and perhaps the current number of shard allocations for those nodes, see below), should return a list of weights for those nodes, which is a number which represents the share of shards that node should be. Eg, it might say node one has a weight of 4, node two a weight of 5, and node three a weight of 3, total shares is 12, so node one gets 4/12s of shards, node two 5/12s, and node 3 3/12s. Allocation decisions are made by first working out which node has the biggest deficit of shards allocated to it (ie, if it should currently have 10 shards, but only has 3, then it currently has a deficit of 7 shards). If two nodes have the same deficit, then the node with the bigger share allocated to it gets the shard. Rebalancing decisions are made by taking the node with the biggest shard deficit and the node with the biggest shard surplus, and transferring a capped number of shards between them to balance them out. If two nodes have equal deficits or surpluses, then the node with the least or most shards is taken respectively. Balancing stops when there are either only deficits of 1, or surpluses of 1 (that number might be configurable) - the stable state will often have left over deficits and surpluses due to rounding.

    This will allow either push or pull based weighting strategies, and I guess this strategy might generally useful in a much broader field of use, so might be a good candidate for inclusion in cluster sharding core.

  2. Knative weighting strategy. This will poll Knative for the routing configuration and keep a cache of it, and will expect each node to have a role that indicates which Knative revision it's part of, and will return the weights accordingly. If the the routing configuration significantly disagrees with the current state of the cluster (for example, if all the nodes passed to the strategy have no traffic routed to them and all the revisions that have traffic routed to them have no nodes currently in the cluster) then it might just return the shares according to how they are currently allocated, so everything keeps its current share - this is the reason why above I said the strategy may be passed the current number of shard allocations for each node.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant