-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gRPC thread pool saturated by blocked Identity calls #18697
Labels
component/zeebe
Related to the Zeebe component/team
kind/toil
Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.
Comments
npepinpe
added
kind/toil
Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.
component/zeebe
Related to the Zeebe component/team
labels
May 22, 2024
ZDP:
I will ping the Identity team about having asynchronous alternatives for their SDK. |
Update on Identity, they're open to it but don't see a big need, and we may want to wait for the mono repo merge before looking into it. Though if we move to virtual thread, there is no real pressure to do any async things here. |
github-merge-queue bot
pushed a commit
that referenced
this issue
Jun 3, 2024
## Description ## Related issues closes #18697
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
component/zeebe
Related to the Zeebe component/team
kind/toil
Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.
Description
We use a custom instance of the
ForkJoinPool
in the gateway to handle our gRPC service logic. Meaning, a gRPC request comes in, it's passed to the Netty event executor loop, and then when it's ready to execute the business logic (e.g.EndpointManager
), it's passed to ourForkJoinPool
.This pool is configurable by users, with a minimum/maximum number of threads. By default, the parallelism level (core pool size) is the number of CPU cores, and the max number of threads is twice the number of CPU cores.
This works fine with our normal code, since all of it is asynchronous. However, we do use the Identity SDK to retrieve credentials and tenant IDs, which performs some HTTP calls. These calls are blocking, and will end up blocking the
ForkJoinPool
thread usingForkJoinPool#managedBlocker
. This causes the current thread to be marked as blocked, and the pool to try and compensate this by spawning another thread (to maintain the level of parallelism).However, if you've reached the maximum number of threads (e.g. 2 * cpuCores), then an exception is thrown, something like this:
Since we're clearly blocked on I/O, I think we could likely compensate and spawn more threads (of course, eventually too many threads is an issue of its own). But this leads us to two options:
ForkJoinPool
allows256 + Math.max(1, Runtime.getRuntime().availableProcessors() - 1)
. The downside here is, I honestly can't tell why 256. The actual bound for the max number of threads is0x7fff
, any other value given will be clamped down to this. From their docs:I would lean towards option 2, but there's still the downside that if users configure a small number of threads, and then will run into this error. But we could start with that - change the default of the max number of gRPC threads to be
256 + Math.max(1, Runtime.getRuntime().availableProcessors() - 1)
, and still let users completely override it if they want.The text was updated successfully, but these errors were encountered: