perf(transport): auto-tune stream receive window #1868

mxinden · 2024-05-02T10:22:24Z

Previously, the stream send and receive windows had a hard limit of 1 MB. On high-latency and/or high-bandwidth connections (i.e. those with a large bandwidth-delay product), 1 MB is not enough to exhaust the available bandwidth.

Sample scenario:

```
delay_s = 0.05
window_bits = 1 * 1024 * 1024 * 8
bandwidth_bits_s = window_bits / delay_s
bandwidth_mbits_s = bandwidth_bits_s / 1024 / 1024 # 160.0
```

In other words, on a 50 ms connection a 1 MB window can at most achieve 160 Mbit/s.
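The same arithmetic can be sketched in Rust. This is only an illustration: the function name `max_throughput_mbit_s` is hypothetical and not part of Neqo, and it follows the sample above in dividing by 1024 * 1024.

```rust
use std::time::Duration;

/// Upper bound on throughput when at most `window_bytes` can be in flight
/// per round trip of length `rtt`.
///
/// Hypothetical helper for illustration; like the sample scenario it
/// converts to "Mbit/s" by dividing by 1024 * 1024.
fn max_throughput_mbit_s(window_bytes: u64, rtt: Duration) -> f64 {
    let window_bits = window_bytes as f64 * 8.0;
    let bits_per_second = window_bits / rtt.as_secs_f64();
    bits_per_second / 1024.0 / 1024.0
}
```

Calling it with a 1 MiB window and a 50 ms round trip gives roughly 160, and doubling the round-trip time halves the bound.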

This commit introduces an auto-tuning algorithm for the stream receive window, increasing the window towards the bandwidth-delay product of the connection.

Fixes #733.

This commit adds a basic smoke test using the `test-fixture` simulator, asserting that on a connection with unlimited bandwidth and a 50 ms round-trip time Neqo can eventually achieve > 1 Gbit/s throughput. This showcases the potential that a future stream flow-control auto-tuning algorithm can have. See mozilla#733.

github-actions · 2024-05-02T10:27:31Z

Failed Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest vs. mvfst: DC

All results

QUIC Interop Runner, client vs. server

Succeeded Interop Tests

Unsupported Interop Tests

chrome vs. neqo-latest: H DC

github-actions · 2024-05-07T10:31:51Z

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 922d266.

neqo-latest as client

neqo-latest as server

lsquic vs. neqo-latest: run cancelled after 20 min
msquic vs. neqo-latest: Z U
mvfst vs. neqo-latest: Z A L1 C1
quinn vs. neqo-latest: V2
xquic vs. neqo-latest: M

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest vs. aioquic: H DC LR C20 M S R 3 B U A L1 L2 C1 C2 6 V2
neqo-latest vs. go-x-net: H DC LR M B U A L2 C2 6
neqo-latest vs. haproxy: H DC LR C20 M S R Z 3 B U A L1 L2 🚀C1 C2 6 V2
neqo-latest vs. kwik: H DC LR C20 M S R ⚠️Z 3 B U A L1 L2 C1 C2 6 V2
neqo-latest vs. lsquic: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
neqo-latest vs. msquic: H DC LR C20 M S R Z B U L1 L2 🚀C1 C2 6 V2
neqo-latest vs. mvfst: H DC LR M R Z 3 B U L2 ⚠️C1 C2 6
neqo-latest vs. neqo: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
neqo-latest vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
neqo-latest vs. nginx: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6
neqo-latest vs. ngtcp2: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
neqo-latest vs. picoquic: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
neqo-latest vs. quic-go: H DC LR C20 M S R Z 3 B U A 🚀L1 L2 C1 C2 6
neqo-latest vs. quiche: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6
neqo-latest vs. quinn: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6
neqo-latest vs. s2n-quic: H DC LR C20 M S R 3 B U E A L1 L2 C1 C2 6
neqo-latest vs. xquic: H DC LR C20 M R Z 3 B U 🚀L1 L2 🚀C1 C2 6

neqo-latest as server

aioquic vs. neqo-latest: H DC LR C20 M S R Z 3 B A L1 L2 C1 C2 6 V2
chrome vs. neqo-latest: 3
go-x-net vs. neqo-latest: H DC LR M B U A L2 C2 6
kwik vs. neqo-latest: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2
msquic vs. neqo-latest: H DC LR C20 M S R B A L1 L2 C1 C2 6 V2
mvfst vs. neqo-latest: H DC LR M 3 B L2 C2 6
neqo vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
ngtcp2 vs. neqo-latest: H DC LR ⚠️C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
picoquic vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2
quic-go vs. neqo-latest: ⚠️H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6
quiche vs. neqo-latest: H DC LR M S R Z 3 B A L1 L2 C1 C2 6
quinn vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6
s2n-quic vs. neqo-latest: H DC LR M S R 3 B E A L1 L2 C1 C2 6
xquic vs. neqo-latest: H DC LR C20 S R Z 3 B U A L1 L2 C1 C2 6

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest vs. aioquic: E
neqo-latest vs. go-x-net: C20 S R Z 3 E L1 C1 V2
neqo-latest vs. haproxy: E
neqo-latest vs. kwik: E
neqo-latest vs. msquic: 3 E
neqo-latest vs. mvfst: C20 S E V2
neqo-latest vs. nginx: E V2
neqo-latest vs. quic-go: E V2
neqo-latest vs. quiche: E V2
neqo-latest vs. quinn: V2
neqo-latest vs. s2n-quic: Z V2
neqo-latest vs. xquic: S E V2

neqo-latest as server

aioquic vs. neqo-latest: U E
chrome vs. neqo-latest: H DC LR C20 M S R Z B U E A L1 L2 C1 C2 6 V2
go-x-net vs. neqo-latest: C20 S R Z 3 E L1 C1 V2
kwik vs. neqo-latest: E
msquic vs. neqo-latest: 3 E
mvfst vs. neqo-latest: C20 S R U E V2
quic-go vs. neqo-latest: ⚠️E V2
quiche vs. neqo-latest: C20 U E V2
s2n-quic vs. neqo-latest: C20 Z U V2
xquic vs. neqo-latest: E V2

github-actions · 2024-05-07T12:24:45Z

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

Linux: Debug Release
macOS: Debug Release
Windows: Debug Release

This commit adds a basic smoke test using the `test-fixture` simulator, asserting the expected bandwidth on a 1 Gbit/s link. Given mozilla#733, the current expected bandwidth is limited by the fixed-size stream receive buffer (1 MiB).

A `Node` (e.g. a `Client`, `Server` or `TailDrop` router) can be in 3 states:

``` rust
enum NodeState {
    /// The node just produced a datagram. It should be activated again as soon as possible.
    Active,
    /// The node is waiting.
    Waiting(Instant),
    /// The node became idle.
    Idle,
}
```

`NodeHolder::ready()` determines whether a `Node` is ready to be processed again. In `NodeState::Waiting(t)`, the node should only be ready when `t <= now`, i.e. once the waiting time has passed, not when `t >= now`.

``` rust
impl NodeHolder {
    fn ready(&self, now: Instant) -> bool {
        match self.state {
            Active => true,
            Waiting(t) => t <= now, // not >=
            Idle => false,
        }
    }
}
```

The previous behavior led to wasteful processing of non-ready `Node`s and thus a long test runtime when e.g. simulating a Gbit connection (mozilla#2203).
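To make the `t <= now` condition concrete, here is a self-contained sketch. It is a simplification under stated assumptions: the `NodeHolder` below holds only the state, whereas the real type in `test-fixture` presumably carries more, and nothing here is copied from the actual source.

```rust
use std::time::Instant;

#[derive(Clone, Copy, Debug)]
enum NodeState {
    /// The node just produced a datagram.
    Active,
    /// The node is waiting until the given instant.
    Waiting(Instant),
    /// The node became idle.
    Idle,
}

/// Simplified stand-in: only the state is modelled here.
struct NodeHolder {
    state: NodeState,
}

impl NodeHolder {
    fn ready(&self, now: Instant) -> bool {
        match self.state {
            NodeState::Active => true,
            // Ready only once the deadline has been reached or passed.
            NodeState::Waiting(t) => t <= now,
            NodeState::Idle => false,
        }
    }
}
```

With a deadline 10 ms in the future, `ready(now)` is false, and it becomes true once `now` reaches the deadline. With the inverted comparison the node would be reported ready the whole time before its deadline.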

codecov · 2024-12-29T19:09:10Z

Codecov Report

Attention: Patch coverage is 93.65079% with 24 lines in your changes missing coverage. Please review.

Project coverage is 93.34%. Comparing base (922d266) to head (4d70b34).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| neqo-transport/src/fc.rs | 87.76% | 15 Missing and 8 partials ⚠️ |
| neqo-transport/src/connection/mod.rs | 94.11% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@           Coverage Diff            @@
##             main    #1868    +/-   ##
========================================
  Coverage   93.33%   93.34%            
========================================
  Files         114      114            
  Lines       36896    37177   +281     
  Branches    36896    37177   +281     
========================================
+ Hits        34438    34703   +265     
- Misses       1675     1680     +5     
- Partials      783      794    +11


mxinden · 2024-12-31T12:11:44Z

neqo-transport/src/send_stream.rs

-pub const SEND_BUFFER_SIZE: usize = 0x10_0000; // 1 MiB
+const MAX_SEND_BUFFER_SIZE: usize = 10 * 1024 * 1024;


Previously, Neqo would buffer at most 1 MB of send data. Now Neqo buffers up to 10 MB. In other words, it supports a send window of up to 10 MB, depending on the receive window updates from the receiver.

Thus, while this pull request is focused on increasing receive (download) throughput, this patch might also affect send (upload) throughput on connections with a high bandwidth-delay product.

The concrete const value is up for discussion. On a 50 ms connection a 10 MB window can achieve 1.6 Gbit/s.
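One way to sanity-check a candidate value is to compute the bandwidth-delay product for a target link and compare it against the cap. The sketch below is illustrative only: `bdp_bytes` is a hypothetical helper, and it treats 1 Gbit/s as 10^9 bit/s.

```rust
use std::time::Duration;

/// Value from the diff above, as `u64` for the arithmetic below.
const MAX_SEND_BUFFER_SIZE: u64 = 10 * 1024 * 1024;

/// Bandwidth-delay product in bytes: the amount of data that must be in
/// flight to keep a link of `bandwidth_bits_per_s` busy for one `rtt`.
///
/// Hypothetical helper; uses integer arithmetic on microseconds.
fn bdp_bytes(bandwidth_bits_per_s: u64, rtt: Duration) -> u64 {
    let bits = u128::from(bandwidth_bits_per_s) * rtt.as_micros() / 1_000_000;
    // Fits in u64 for any realistic bandwidth and RTT.
    (bits / 8) as u64
}
```

For 1 Gbit/s at 50 ms this yields 6,250,000 bytes, which is below the 10 MiB cap, so under these assumptions the cap would not be the limiting factor on such a link.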

mxinden · 2024-12-31T12:20:42Z

neqo-transport/src/send_stream.rs

@@ -494,10 +494,10 @@ impl TxBuffer {

    /// Attempt to add some or all of the passed-in buffer to the `TxBuffer`.
    pub fn send(&mut self, buf: &[u8]) -> usize {
-        let can_buffer = min(SEND_BUFFER_SIZE - self.buffered(), buf.len());
+        let can_buffer = min(MAX_SEND_BUFFER_SIZE - self.buffered(), buf.len());


Note that while the send buffer can now grow up to `MAX_SEND_BUFFER_SIZE`, it is never shrunk. My rationale:

The majority of streams are short-lived. In other words, even if a stream reaches a send buffer of `MAX_SEND_BUFFER_SIZE`, the buffer is soon de-allocated.

For long-lived buffers reaching `MAX_SEND_BUFFER_SIZE`, my assumption is that `MAX_SEND_BUFFER_SIZE` is chosen conservatively enough that the additional allocation doesn't hurt.

Intuitively any shrinking heuristic likely leads to more memory churn, rather than decreasing resident memory.

Thoughts?
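For readers who want to see the grow-but-never-shrink behavior in isolation, here is a minimal sketch. It is not Neqo's `TxBuffer`: `SketchTxBuffer`, `retire` and `capacity` are hypothetical names, and a plain `VecDeque<u8>` stands in for the real buffer.

```rust
use std::cmp::min;
use std::collections::VecDeque;

const MAX_SEND_BUFFER_SIZE: usize = 10 * 1024 * 1024;

/// Hypothetical stand-in for a send buffer with an upper bound.
#[derive(Default)]
struct SketchTxBuffer {
    buf: VecDeque<u8>,
}

impl SketchTxBuffer {
    fn buffered(&self) -> usize {
        self.buf.len()
    }

    /// Buffer as much of `data` as fits below the cap.
    /// Returns the number of bytes accepted.
    fn send(&mut self, data: &[u8]) -> usize {
        let can_buffer = min(MAX_SEND_BUFFER_SIZE - self.buffered(), data.len());
        self.buf.extend(&data[..can_buffer]);
        can_buffer
    }

    /// Drop `n` bytes from the front, e.g. once they are acknowledged.
    /// The allocation is kept: `VecDeque` does not release capacity here.
    fn retire(&mut self, n: usize) {
        let n = min(n, self.buf.len());
        drop(self.buf.drain(..n));
    }

    fn capacity(&self) -> usize {
        self.buf.capacity()
    }
}
```

Writing 11 MiB into an empty buffer accepts exactly 10 MiB, and further writes return 0 until data is retired. After retiring everything the length is 0 while the capacity stays at or above 10 MiB, which is the memory trade-off described above.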

mxinden · 2024-12-31T12:27:02Z

neqo-transport/src/fc.rs

+        // Auto-tune max_active, i.e. the flow control window.
+        //
+        // If the sending rate ( window_bytes used / elapsed ) exceeds the rate
+        // allowed by the maximum flow control window and the current rtt (
+        // max_active / rtt ), try to increase the maximum flow control window (
+        // max_active ).
+        if let Some(max_allowed_sent_at) = self.max_allowed_sent_at {
+            let elapsed = now.duration_since(max_allowed_sent_at);
+            let window_bytes_used = self.max_active - (self.max_allowed - self.retired);
+
+            // Same as `elapsed / rtt < window_bytes_used / max_active`
+            // without floating point division.
+            if elapsed.as_micros() * u128::from(self.max_active)
+                < rtt.as_micros() * u128::from(window_bytes_used)
+            {
+                let prev_max_active = self.max_active;
+                // Try doubling the flow control window.
+                //
+                // Note that the flow control window should grow at least as
+                // fast as the congestion control window, in order to not
+                // unnecessarily limit throughput.
+                self.max_active = min(2 * self.max_active, MAX_RECV_WINDOW_SIZE);
+                qdebug!(
+                    "Increasing max stream receive window: previous max_active: {} MiB new max_active: {} MiB last update: {:?} rtt: {rtt:?} stream_id: {}",
+                    prev_max_active / 1024 / 1024, self.max_active / 1024 / 1024,  now-self.max_allowed_sent_at.unwrap(), self.subject,
+                );
+            }
+        }


Auto-tuning is executed right before sending a window update.

A window update is sent either when `WINDOW_UPDATE_FRACTION` is reached (see `fn should_send_flowc_update` above), or when the remote sends a `STREAM_DATA_BLOCKED` frame.
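The check in the diff above can be exercised in isolation with a small sketch. Everything outside the comparison itself is an assumption: `SketchReceiverFc` is a hypothetical struct, the 10 MiB value for `MAX_RECV_WINDOW_SIZE` is made up for illustration, and the invariant `retired <= max_allowed <= retired + max_active` is taken for granted.

```rust
use std::cmp::min;
use std::time::{Duration, Instant};

/// Assumed value for illustration; the real constant is not shown here.
const MAX_RECV_WINDOW_SIZE: u64 = 10 * 1024 * 1024;

/// Hypothetical, stripped-down receiver flow-control state.
struct SketchReceiverFc {
    max_active: u64,
    max_allowed: u64,
    retired: u64,
    max_allowed_sent_at: Option<Instant>,
}

impl SketchReceiverFc {
    /// Doubles `max_active` (up to the cap) if the window was consumed
    /// faster than `max_active / rtt`. Returns whether it grew.
    fn auto_tune(&mut self, now: Instant, rtt: Duration) -> bool {
        let sent_at = match self.max_allowed_sent_at {
            Some(t) => t,
            None => return false,
        };
        let elapsed = now.duration_since(sent_at);
        let window_bytes_used = self.max_active - (self.max_allowed - self.retired);

        // Same as `elapsed / rtt < window_bytes_used / max_active`,
        // cross-multiplied to avoid floating point division.
        if elapsed.as_micros() * u128::from(self.max_active)
            < rtt.as_micros() * u128::from(window_bytes_used)
        {
            let prev = self.max_active;
            self.max_active = min(2 * self.max_active, MAX_RECV_WINDOW_SIZE);
            return self.max_active > prev;
        }
        false
    }
}
```

For example, with a 1 MiB window of which half was consumed 10 ms after the last update on a 50 ms path, the window doubles to 2 MiB. If the same consumption had taken 100 ms, the window would stay unchanged.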

mxinden · 2024-12-31T12:32:33Z

neqo-transport/src/fc.rs

+        // Auto-tune max_active, i.e. the flow control window.
+        //
+        // If the sending rate ( window_bytes used / elapsed ) exceeds the rate
+        // allowed by the maximum flow control window and the current rtt (
+        // max_active / rtt ), try to increase the maximum flow control window (
+        // max_active ).
+        if let Some(max_allowed_sent_at) = self.max_allowed_sent_at {
+            let elapsed = now.duration_since(max_allowed_sent_at);
+            let window_bytes_used = self.max_active - (self.max_allowed - self.retired);
+
+            // Same as `elapsed / rtt < window_bytes_used / max_active`
+            // without floating point division.
+            if elapsed.as_micros() * u128::from(self.max_active)
+                < rtt.as_micros() * u128::from(window_bytes_used)
+            {
+                let prev_max_active = self.max_active;
+                // Try doubling the flow control window.
+                //
+                // Note that the flow control window should grow at least as
+                // fast as the congestion control window, in order to not
+                // unnecessarily limit throughput.
+                self.max_active = min(2 * self.max_active, MAX_RECV_WINDOW_SIZE);
+                qdebug!(
+                    "Increasing max stream receive window: previous max_active: {} MiB new max_active: {} MiB last update: {:?} rtt: {rtt:?} stream_id: {}",
+                    prev_max_active / 1024 / 1024, self.max_active / 1024 / 1024,  now-self.max_allowed_sent_at.unwrap(), self.subject,
+                );
+            }
+        }


Note that this is not the exact algorithm suggested by @martinthomson in #733 (comment).

The algorithm proposed in this pull request adopts Martin's trigger mechanism, namely to increase the window based on the perceived BDP:

> Therefore, I suggest that if the rate at which self.retired increases (that is, the change in that value, divided by the time elapsed) exceeds some function of self.max_active / path.rtt,

It does not adopt his increase mechanism, i.e. to increase by the amount of retired data:

> then we can increase self.max_active by the amount that self.retired has increased.

Instead, the window is simply doubled. The rationale is documented in the code comment above:

> // Try doubling the flow control window.
> //
> // Note that the flow control window should grow at least as
> // fast as the congestion control window, in order to not
> // unnecessarily limit throughput.
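To get a feel for the difference between the two increase mechanisms, the sketch below counts how many increases each needs to reach the cap. It rests on loudly stated assumptions: the 10 MiB cap is made up for illustration, all function names are hypothetical, and the additive variant is only as meaningful as the retired amount fed into it.

```rust
use std::cmp::min;

/// Assumed value for illustration; the real constant is not shown here.
const MAX_RECV_WINDOW_SIZE: u64 = 10 * 1024 * 1024;

/// Increase mechanism used in this pull request: double, up to the cap.
fn grow_by_doubling(max_active: u64) -> u64 {
    min(2 * max_active, MAX_RECV_WINDOW_SIZE)
}

/// Increase mechanism from the quoted suggestion: add the amount by
/// which `retired` has grown since the last update, up to the cap.
fn grow_by_retired(max_active: u64, retired_delta: u64) -> u64 {
    min(max_active + retired_delta, MAX_RECV_WINDOW_SIZE)
}

/// Number of increases `grow` needs to take `start` to the cap.
/// Stops early if `grow` makes no progress.
fn steps_to_cap(start: u64, grow: impl Fn(u64) -> u64) -> u32 {
    let mut window = start;
    let mut steps = 0;
    while window < MAX_RECV_WINDOW_SIZE {
        let next = grow(window);
        if next == window {
            break;
        }
        window = next;
        steps += 1;
    }
    steps
}
```

Starting from 1 MiB, doubling reaches the assumed cap in 4 increases. If each update happened after half of the current window had been retired (an arbitrary choice for this example), the additive variant would need 6.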

mxinden added 2 commits April 25, 2024 16:43

mxinden added 3 commits May 7, 2024 11:39

clippy

1bfd2f5

Merge branch 'main' of https://github.com/mozilla/neqo into auto-tuning

167d93f

fix tests

edc4035

mxinden mentioned this pull request May 8, 2024

refactor: enable mozilla-central http3server to use neqo-bin #1878

Merged

mxinden added 17 commits May 14, 2024 15:49

Don't increase on STREAM_DATA_BLOCKED

12390c8

enforce STREAM_MAX_ACTIVE_LIMIT

0ad1b77

add TODO starting below 1MiB

8e6a5af

Add NonRandomDelay

716ba20

Reduce transfer amount and throughput expectation

3b2a52b

Merge branch 'main' of https://github.com/mozilla/neqo into auto-tuning

dba8190

Merge branch 'main' of https://github.com/mozilla/neqo into auto-tuning

9b9524e

debugging

ac5d024

Merge branch 'main' of https://github.com/mozilla/neqo into auto-tuning

22f1d7e

deduplicate in fc.rs

db3cfe5

Google vs Thomson vs BDP

62dc2ba

More testing

841d086

Adjust bench

5ec58cb

Move to benchmark

6dd3829

Add TODO on sent rtt in receive auto-tuning

f9b9a27

larseggert mentioned this pull request Dec 4, 2024

Add stats about how often a connection is blocked because of the flow control limits #790

Open

mxinden added 3 commits December 13, 2024 11:53

update every rtt/4 and don't use floating point division

8aff641

Settle on Thomson algorithm

4a146e6

More aggressive increase

5e7d7d6

mxinden added 2 commits December 29, 2024 20:01

Outdated comment

1e3dd80

Document Mtu node impl

2934ffa

mxinden changed the title ~~feat: auto-tune stream receive window~~ perf(transport): auto-tune stream receive window Dec 31, 2024

mxinden added 2 commits December 31, 2024 12:53

Refactor fc_state_recv_7

8e0c6d8

Remove todo to shrink buffer

4efa2f2

mxinden commented Dec 31, 2024

Remove outdated todo

e68813e

mxinden commented Dec 31, 2024

mxinden added 13 commits December 31, 2024 14:27

Add max_send_buffer_size test

e808b51

Merge remote-tracking branch 'mozilla/main' into auto-tuning

57c2870

Clippy

ba2c321

Clippy

6de63fb

Plan tests

aba14e9

Add tests

4ee45fc

Fix min_bandwidth

82eb50b

Add quick check style test

fb51122

Interleaving test

1fa5a77

Rational for MAX_RECV_WINDOW_SIZE

6cb8de9

Clippy

f10cf32

Fix test

8419c3f

Fix intra doc links

5724444

mxinden marked this pull request as ready for review January 4, 2025 15:30

mxinden requested review from KershawChang, martinthomson and larseggert as code owners January 4, 2025 15:30

Expose raw bandwidth

4d70b34

mxinden mentioned this pull request Jan 5, 2025

test(transport): assert maximum bandwidth on gbit link #2203

Closed
