Inconsistent server-side error in ServerStreamTracer vs ServerInterceptor #8120
Comments
Edit: updated the ticket to reflect the inconsistency between ServerStreamTracer and ServerInterceptor (rather than StreamObserver, as previously mentioned).
The server call layer does not preserve the error or reason for a call being terminated. A call either finishes normally (client-side half-close) or is cancelled (anything other than a normal half-close is treated as a cancellation). Server call cancellation does not propagate the error/reason.
gRPC is based on HTTP/2, which is designed for a client-server communication model (rather than peer-to-peer), so it is inherently asymmetric between the two sides. From a server application's perspective, anything that happens in the middle of an HTTP request and causes the request to be terminated is a cancellation, whether it is caused by a timeout, explicit cancellation by the client, network errors, etc. Like HTTP servers, server applications usually do not care why a request was cancelled.
ServerStreamTracer works at a lower level than the server call. It interacts directly with the stream and the HTTP traffic, so it knows what is happening on the network, and the status seen by ServerStreamTracer reflects the real state of the network.
Seen by consumers with different perspectives (server application vs. traffic monitoring), the two layers present different levels of state information. Server calls and streams map one-to-one, but the statuses they deliver are only minimally correlated, because server calls throw away the status information and treat anything other than normal termination as a cancellation.
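To make the two-layer picture above concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions: `Code`, `ToyStatus`, and `StatusLayers` are hypothetical names invented for illustration and are not grpc-java classes, and the collapsing rule is a simplification of the explanation in this comment, not the library's actual code. The two description strings are the ones quoted in the issue report.

```java
// Toy model only: these types are hypothetical and are NOT grpc-java classes.
enum Code { OK, CANCELLED, UNAVAILABLE, DEADLINE_EXCEEDED }

final class ToyStatus {
    final Code code;
    final String description;

    ToyStatus(Code code, String description) {
        this.code = code;
        this.description = description;
    }

    @Override
    public String toString() {
        return "Status{code=" + code + ", description=" + description + "}";
    }
}

final class StatusLayers {
    // Stream layer (what a stream tracer would observe in this model):
    // the transport-level status is passed through unchanged.
    static ToyStatus seenByStreamTracer(ToyStatus transportStatus) {
        return transportStatus;
    }

    // Call layer (what the server application would observe in this model):
    // normal completion is kept, anything else is collapsed into CANCELLED
    // and the original reason is discarded.
    static ToyStatus seenByServerCall(ToyStatus transportStatus) {
        if (transportStatus.code == Code.OK) {
            return transportStatus;
        }
        return new ToyStatus(Code.CANCELLED,
                "Cancelling request because of error from client.");
    }

    public static void main(String[] args) {
        // The client drops the connection without sending a cancel or reset.
        ToyStatus transport = new ToyStatus(Code.UNAVAILABLE,
                "connection terminated for unknown reason");
        System.out.println("stream tracer sees: " + seenByStreamTracer(transport));
        System.out.println("server call sees:   " + seenByServerCall(transport));
    }
}
```

In this model the mapping from transport status to call status is many-to-one, which is why the two values cannot be matched up by status code alone.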
Thanks for the insights, @voidzcy. I think I now understand where the different Status values come from. If I'm understanding it correctly, I have a follow-up question/request, which is that it would be nice inside
The Status that pops through Status is used as a general-purpose carrier of end-of-life information. The Status code itself is not meant to indicate whether something happened locally or on the network. The description/cause of a Status does play a significant role and should not be ignored. There is a convention that we follow and also suggest to users: never use a Status without a description/cause.
Closing; no action is needed, since the current behavior works as intended. We will improve some of the docs according to this discussion after #7558.
What version of gRPC-Java are you using?
1.37.0
What is your environment?
Linux, OpenJDK 1.8
openjdk version "1.8.0_282"
OpenJDK Runtime Environment (build 1.8.0_282-8u282-b08-0ubuntu1~20.04-b08)
OpenJDK 64-Bit Server VM (build 25.282-b08, mixed mode)
What did you expect to see?
In a bidirectional streaming service, I'm observing a different status passed to ServerCall.close(Status, Metadata) versus the status passed to ServerStreamTracer.streamClosed(Status) when a misbehaving client abruptly terminates the connection (without sending any sort of cancel or reset).
What did you see instead?
The server has a logging interceptor which intercepts the ServerCall.close(Status, Metadata) method and logs the status. The status seen here is
Status{code=CANCELLED, description=Cancelling request because of error from client., cause=null}
We also have a ServerStreamTracer installed on the server, but the Status passed to ServerStreamTracer.streamClosed(Status) is
Status{code=UNAVAILABLE, description=connection terminated for unknown reason, cause=null}
Steps to reproduce the bug
I have a minimal reproducer with some instructions here.
Comments
If the client is terminating the connection, then I suppose there really is no true status (either from the server or from the client), so I could be convinced that either of these Status codes is valid. What really threw off our metrics and sent a couple of engineers down a debugging rabbit hole is that we couldn't correlate any request logs on the server (logged from the ServerInterceptor) with the status metrics emitted from the ServerStreamTracer. So regardless of which Status is ultimately reported for an abruptly closed connection, I think it should at least be consistent between these two sources.
This may also be related to #7558 .