Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ginepro does not load 'balance' #39

Open
conradludgate opened this issue Jan 9, 2023 · 4 comments
Open

ginepro does not load 'balance' #39

conradludgate opened this issue Jan 9, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@conradludgate
Copy link
Contributor

Bug description

ginepro indirectly uses tower::Balance which makes use a best-of-two random load balancing strategy.

Unfortunately, tonic has no built in mechanism for determining load, so this is hard coded to be 0 always.

https://docs.rs/tonic/0.8.3/src/tonic/transport/service/connection.rs.html#114-120

This means we have randomise distribution. This is still a balancing technique but is known to not be very good

@conradludgate conradludgate added the bug Something isn't working label Jan 9, 2023
@conradludgate
Copy link
Contributor Author

We don't need to use tonic's Channel abstraction since Grpc client makes use of tower services directly. That let's us implement this logic ourselves.

Now for how we communicate load. An idea I had is to have a well-known header in responses for how much load the server is currently experiencing.

This can be cached by the client. In high load situations this will be accurate. In low load situations it will be less accurate but arguably less important.

We would need to decide on a good header for this.

An alternative is to have an alternative well known load look up RPC. We can call that on a regular basis to update load values

If the load endpoint or header doesn't exist, we can stop checking for it and fallback on the original strategy of random or round robin. This can be configured dynamically

@ThomWright
Copy link
Member

Have you seen this which seems to be what Linkerd use?

@conradludgate
Copy link
Contributor Author

conradludgate commented Jan 10, 2023

Using latency is clever, but I'm not sure exactly how the pending calculation works. I'm probably missing something.

Ahh I see. It's not 'pending', it's 'in flight requests'.

So load=latency*in_flight.

I have no objections to this, since it makes it only a client side approach.

@xmakro
Copy link
Contributor

xmakro commented Aug 26, 2023

It would be good to mention this in the docs. Currently ginepro doc says it's using using The Power of Two Choices, which is misleading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

3 participants