Skip to content

Conversation

@yliu-intuit
Copy link


Summary

This PR fixes a bug in the connection setup logic where HTTP/2 configurations (specifically http2_max_headers_list_bytes) were being ignored.

Root Cause Analysis

The issue stems from how the hyper_util builders were being used. The configure_connection helper function was returning an Http1Builder (the result of .http1()) rather than the parent auto::Builder.

By calling .serve_connection() on the child Http1Builder instance, the application effectively started a standard HTTP/1.1 server. This has two specific consequences:

  1. Protocol Sniffing Bypassed: The logic that detects the protocol (HTTP/1 vs HTTP/2) resides in the Parent builder. By serving from the Child, this detection logic is skipped.
  2. Configuration Ignored: Since the server was running as a strictly HTTP/1 instance, any configuration applied to the HTTP/2 builder (such as max_header_list_size) was dropped and never applied to the active connection.

Visualizing the Issue:

       [Current Buggy Flow]                        [Corrected Flow]

  auto::Builder (Parent)                      auto::Builder (Parent) 
  [Holds H2 Config]                           [Holds H2 Config] <--- Config Applied
         |                                           |
         +--> .http1()                               |
         |    [Child Builder]                        |
         |          |                                |
         |          +--> serve_connection()          +--> serve_connection()
         |               (Bypasses Parent)                (Uses Sniffing Logic)
         |               (H2 Config Lost)                 (H2 Config Active)
         |
         X (Path taken by code)

Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • PR description explains the motivation for the change and relevant context for reviewing
  • PR description links appropriate GitHub/Jira tickets (creating when necessary)
  • Changeset is included for user-facing changes
  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Metrics and logs are added3 and documented
  • Tests added and passing4
    • Unit tests
    • Integration tests
    • Manual tests, as necessary

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. A lot of (if not most) features benefit from built-in observability and debug-level logs. Please read this guidance on metrics best-practices.

  4. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

@yliu-intuit yliu-intuit requested a review from a team as a code owner December 2, 2025 19:44
@apollo-cla
Copy link

@yliu-intuit: Thank you for submitting a pull request! Before we can merge it, you'll need to sign the Apollo Contributor License Agreement here: https://contribute.apollographql.com/

@yliu-intuit
Copy link
Author

Hi @aaronArinder @abernix , could you help take a look on this PR?

@aaronArinder
Copy link
Contributor

@yliu-intuit: this makes sense; let me validate it and respond in a bit! thanks for opening a pr

@aaronArinder
Copy link
Contributor

alas, I wasn't able to repro the error--it seems to work for me, but that doesn't mean it actually works, it'd be nice to see it working for you

here's a video running through it; let me know how to test it to make sure I'm tracking with the problem. Also let me know if you can't see the video for some reason

fwiw, I think the http1builder is fine to pass back because the http config has already been set and when the server is built, both configs are available via their respective builders--that's at least what I recollect from reading the hyper code a bit ago, but that might be wrong and I don't have time right now to dig back through it (hopefully seeing it working is enough proof, but again, I might be wrong about it actually working, so send over a repro!)

@yliu-intuit
Copy link
Author

Hi @aaronArinder , Thanks a lot for taking a look on this and try to reproduce it!! I just did a quick revalidation in v2.9. This is the curl command I am using. No matter how I adjust the config to 1kb or 48kb, the router would return 431 at exactly 16kb single header.
curl --http2-prior-knowledge -sS -D - \ -H 'content-type: application/json' \ --data '{"query":"{ __typename }"}' \ http://localhost:8090/graphql
It is using TCP not TLS in this case, matching our h2c protocol in the test environment.
Unfortunately, we have an event and I will be gone for the rest of the day, but I can send you the complete the script and possibly a video later tonight.

@aaronArinder
Copy link
Contributor

ohhh, yeah, interesting; without upgrading, maybe there's something there that's breaking (I've been kind of worried about the serve_connection_with_upgrades, but it predates the http2 header config change and I haven't dug too deeply into how it might fail)

without upgraded (using prior knowledge), I'm still able to get it to work for http2/431s--details in the top-right of the screenshot):

Screenshot 2025-12-03 at 3 56 02 PM

same, but for http2/200s:

Screenshot 2025-12-03 at 3 59 42 PM

let me know what else I can tweak to try and repro

@yliu-intuit
Copy link
Author

hmmm interesting @aaronArinder , on my side it is getting 431 with this command.
curl --http2-prior-knowledge -v --trace-time \ -H 'content-type: application/json' \ -H "X-Large: $(head -c 20480 </dev/zero | tr '\0' 'x')" \ --data '{"query":"{ __typename }"}' \ http://localhost:8090/graphql
with this config
limits: http1_max_request_headers: 1 http1_max_request_buf_size: 200kb http2_max_headers_list_bytes: 48kb
Unfortunately, my company laptop doesn't allow me to paste the screenshot, otherwise I can better demonstrate this

@aaronArinder
Copy link
Contributor

retrying with that curl and config doesn't quite do the dirt; so, I'm wondering if your router isn't getting the right config? It's hitting the default header size limit from hyper, making me think that maybe there's a stale config or it's somehow not getting the right values for the http2 header option?

@yliu-intuit
Copy link
Author

That's possible that the router doesn't get the config. but we have tested in different environments, in remote environment and in my laptop as well... I just tried again with this PR and the same config. The results are same, 2.9.0 is getting 431 and my fix get 200. Do you know how did Marcelo reproduced the issue?

@yliu-intuit
Copy link
Author

@aaronArinder I have included a recording inside our support ticket. https://support.apollographql.com/portal/1023/TSH-21029 maybe you can check that

@aaronArinder
Copy link
Contributor

@yliu-intuit: I just saw your video and am kind of confused at what might be wrong; it doesn't look like the config (from the logs, I think it's being taken up), so we probably need to try some other stuff:

  • can you try testing without the wrapping code; ie, just 2.9.0 or dev of this repo or your forked repo without your changes? that is, use the router as a binary rather than as a dependency; I'm curious if that works (it's what I'm doing, so hopefully it does), it might get us closer to figuring out where the problem lies (it might have something to do with the code surrounding the router)
  • can you run with tracing-level logs for both 2.9.0 and your branch? throw them in a file and send them over via the ticket, that might help us get to the bottom of things faster
    • use RUST_LOG="trace,jsonpath_lib=info", jsonpath_lib generates a bunch of traces

@yliu-intuit
Copy link
Author

@aaronArinder , sure, let me try that. Let me also try using TLS as well in my local.

@yliu-intuit
Copy link
Author

ok interesting @aaronArinder , when I build the router binary, it indeed worked with h2c. so what kind of surrounding code could go wrong. and what made this PR worked again with the surrounding code?

@yliu-intuit
Copy link
Author

let me run with trace level logging and see what happened

@yliu-intuit
Copy link
Author

Hi @aaronArinder , I have included the trace log inside our support tickets

@yliu-intuit
Copy link
Author

Hi @aaronArinder , more findings:

  1. with the latest dev branch, the HTTP2 header size is respected as you have shown giving 200
  2. however, if you switch to tag v2.9.0, the HTTP2 header size is not respected giving 431
  3. when I pulled latest dev inside the wrapper, HTTP2 header is respected giving 200

@yliu-intuit
Copy link
Author

I see, #8673 is not included in the v2.9.0 release

@aaronArinder
Copy link
Contributor

@yliu-intuit: ahhh! yeah, that'd do it. I forgot that I missed the cut date for 2.9.0 with that pr; it should make it into 2.10.0, which should be sometime this month! Sorry about that, if I'd slowed down to think about it I'd probably have remembered

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants