Break down errors by extension codes #6381

Open
n1ru4l opened this issue Jan 17, 2025 · 1 comment

Comments

@n1ru4l
Contributor

n1ru4l commented Jan 17, 2025

Background

We gather information about GraphQL requests whose response contains errors (the errors key is present within the GraphQL response).
This information is used to display the success and failure rate of GraphQL requests within the Insights view of the Hive Console dashboard.

Total failure rate

Image

Failure rate compared to success rate

Image

Failure rate by operation

Image

A common practice is to use the code property within the error extensions to identify the origin of the error.

E.g. if an access token sent to a GraphQL API is expired, the following response could be interpreted by the frontend client as a signal to refresh the access token.

{
  "errors": [
    {
       "message": "AccessToken expired",
       "extensions": { "code":  "NEEDS_REFRESH" }
    }
  ]
}

Or if a request was made without an access token, the following response could be interpreted by the frontend client as a signal to show a log-in form.

{
  "errors": [
    {
       "message": "AccessToken missing",
       "extensions": { "code":  "UNAUTHENTICATED" }
    }
  ]
}
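As a minimal sketch of how a frontend client might interpret the two responses above, the following TypeScript maps error codes to a client action. The type names, the `decideAction` function, and the action labels are hypothetical and only illustrate the pattern; they are not part of any Hive or Apollo API.

```typescript
// Minimal shape of a GraphQL error that may carry extensions.code.
interface ClientGraphQLError {
  message: string;
  extensions?: { code?: unknown };
}

interface ClientGraphQLResponse {
  data?: unknown;
  errors?: ClientGraphQLError[];
}

// Hypothetical actions a client could take; the names are illustrative.
type ClientAction = "none" | "refresh-token" | "show-login" | "show-error";

// Returns true if any error in the list carries the given code.
function hasCode(errors: ClientGraphQLError[], code: string): boolean {
  return errors.some((error) => error.extensions?.code === code);
}

function decideAction(response: ClientGraphQLResponse): ClientAction {
  const errors = response.errors ?? [];
  if (errors.length === 0) return "none";
  // A missing token takes priority over an expired one (an assumption).
  if (hasCode(errors, "UNAUTHENTICATED")) return "show-login";
  if (hasCode(errors, "NEEDS_REFRESH")) return "refresh-token";
  // Errors without a recognized code fall back to a generic error state.
  return "show-error";
}
```

Errors without a recognized code fall through to a generic state, which is the case this issue aims to make visible in usage reporting.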

Apollo Server initially popularized this pattern, and it has become an unofficial standard for handling arbitrary errors that are not defined in the schema SDL.


Task Overview

Users of Hive want to see the reason why the execution of a request failed without using additional tools. An enhancement would be to include the error code(s) within the usage reporting.

The following things need to be done:

  1. Adjust the usage service to support sending error codes to the usage reporting API for the v2 protocol
  2. Adjust the Hive SDKs to extract the error codes from an error response and include them in the JSON payload sent to the usage reporting API (also include a way to customize the extraction of the error codes; sending error codes should be opt-in)
  3. Figure out how to store the error code data within our ClickHouse database
  4. Figure out how to display the information within the Hive Insights dashboard
  • Breakdown of all error codes
  • Breakdown of error codes per operation within the operation details view
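The SDK-side extraction (step 2) and the per-operation breakdown (step 4) could be sketched roughly as follows. This is an illustration under assumptions, not the Hive SDK API: the option shape (`enabled`, `extract`), the function names, and the decision to report each code at most once per request are all hypothetical.

```typescript
interface ReportedGraphQLError {
  message: string;
  extensions?: Record<string, unknown>;
}

interface ReportedExecutionResult {
  data?: unknown;
  errors?: ReadonlyArray<ReportedGraphQLError>;
}

// Custom extractors return null when an error has no usable code.
type ErrorCodeExtractor = (error: ReportedGraphQLError) => string | null;

// Default behavior (assumed): read extensions.code when it is a string.
const defaultExtractor: ErrorCodeExtractor = (error) => {
  const code = error.extensions?.["code"];
  return typeof code === "string" ? code : null;
};

// Hypothetical option shape; reporting is off unless explicitly enabled.
interface ErrorCodeReportingOptions {
  enabled: boolean;
  extract?: ErrorCodeExtractor;
}

// Collects the distinct error codes of one response, sorted for stable output.
function collectErrorCodes(
  result: ReportedExecutionResult,
  options: ErrorCodeReportingOptions,
): string[] {
  if (!options.enabled || !result.errors) return [];
  const extract = options.extract ?? defaultExtractor;
  const codes = new Set<string>();
  for (const error of result.errors) {
    const code = extract(error);
    if (code !== null) codes.add(code);
  }
  return Array.from(codes).sort();
}

interface ReportEntry {
  operationName: string;
  errorCodes: string[];
}

// Counts, per operation, how many reported requests carried each code.
function breakdownByOperation(
  entries: ReadonlyArray<ReportEntry>,
): Map<string, Map<string, number>> {
  const breakdown = new Map<string, Map<string, number>>();
  for (const entry of entries) {
    let perCode = breakdown.get(entry.operationName);
    if (perCode === undefined) {
      perCode = new Map<string, number>();
      breakdown.set(entry.operationName, perCode);
    }
    for (const code of entry.errorCodes) {
      perCode.set(code, (perCode.get(code) ?? 0) + 1);
    }
  }
  return breakdown;
}
```

Deduplicating codes per request keeps the counts comparable to the existing per-request failure rate; counting every individual error instead would be an equally valid choice with different semantics.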

Links

@thomas-colgrove

Big upvote for this issue! This would be incredibly useful to my team. Having this functionality would help us determine what types of errors we're seeing in prod and whether they're from expected user behavior or from backend bugs we need to fix.
