twmb (Owner) commented Nov 20, 2025

On very busy systems, connections can be cut immediately after opening even when everything is otherwise healthy and TLS and SASL are configured properly. We now track consecutive EOF errors per connection type, and only mark the error as non-retryable once EOF has happened 3x in a row.

This looks a bit odd with the deferred track closure, but structuring it this way ensures we reset the EOF count on any success state.

Updates b2620e2.

@twmb twmb force-pushed the eof3x branch 7 times, most recently from 263c644 to 490ee2f Compare November 21, 2025 19:48

twmb (Owner, Author) commented Nov 21, 2025

not good
