You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi.
This is a tricky subject to understand, but it is a regression introducedy by a80c248
present in urllib3>=1.26.16
That commit causes poolmanager to clear its state (namely connections) based on the whims of the garbage collector.
However, there is a pretty simple scenario where references to poolmanager can be kept alive for too long and code that would reliably close all connections and clear all state no longer works.
I happened to stumble upon this with a test file which tests for error handling using the requests library which internally uses urllib3.
Basically I trigger errors inside the urlib3 code by using a local mock http server which misbehaves.
The urllib3 code then gets a RemoteDisconnected() exception from http/client.py", line 287, in _read_status. The exception includes the traceback together with outer stack frames. The traceback keeps all objects in the entire call chain pinned in memory.
Then this exception is saved inside a Retries object in connectionpool.py, here
Another location is in urllib3/util/retry.py where a retry object is created with the history of failed requests, which also saves exceptions
new_retry = self.new(
[...]
history=history,
)
Basically, any exception being triggered causes this problem.
To put the cherry on top, these messages are logged, and when using pytest (or another testing framework), it happily collects all log messages (LogRecord objects) for reporting after the test finishes, which in turn keep the retry objects alive inside LogRecord.args. So during a unit test, nothing gets garbage collected.
Actual Behavior
So below is a smaller version of the test I had which gets stuck indefinetly. It gets stuck in the HTTPServer.shutdown() call because it waits for the serve_forever loop to quit and set a threading.Event. However, that loop is stuck because the socket and connection are still open, because the session and connections have not been garbage collected.
Expected Behavior
Although exceptions can be saved, their tracebacks should be cleared. Every location where an exception is saves as property of some object, the stacktrace should be cleared with ex.with_traceback(None).
Steps to Reproduce
This test hangs.
import logging
import threading
import pytest
import requests.adapters
logger = logging.getLogger(__name__)
class HttpServerControl:
force_error: str = ""
base_url: str = ""
server_handled_request: threading.Event
@pytest.fixture(scope="module")
def http_server() -> HttpServerControl:
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer, HTTPStatus, _get_best_family
# The code in this function was adapted from http.server.test()
protocol = "HTTP/1.1" # HTTP 1.1 supports automatic Keep-Alive
port = 0 # Use port 0 so a free random port is allocated
bind = "127.0.0.1" # Bind to localhost so nothing else can connect.
class LocalHttpServer(HTTPServer):
thread = None
socket = None
def shutdown_request(self, request):
super().shutdown_request(request)
server_ctl.server_handled_request.set()
def shutdown(self, *_):
logger.info("HTTPServer shutdown")
if self.thread:
super().shutdown()
self.thread.join()
self.socket = self.thread = None
def start(self):
assert not self.thread
self.thread = threading.Thread(target=httpd.serve_forever)
self.thread.start()
def __enter__(self):
return self
__del__ = __exit__ = shutdown
class RequestHandler(BaseHTTPRequestHandler):
protocol_version = protocol
def do_GET(self):
self.log_message("GET %s HTTP/%s", self.path, self.protocol_version)
force_error, server_ctl.force_error = server_ctl.force_error, ""
if force_error:
self.close_connection = True
self.log_message("Mark close connection: forced %s [%s]", force_error, self.path)
return
contents = b"{}"
try:
self.log_message("GET OK %s [start response]", self.path)
self.send_response(HTTPStatus.OK)
self.send_header("Content-type", "application/json")
self.send_header("Content-Length", str(len(contents)))
self.end_headers()
self.wfile.write(contents)
self.log_message("GET OK %s [end response]", self.path)
except OSError as e: # Host stops reading too early.
self.log_message("GET OK %s [failed response]\n%s", self.path, str(e))
def log_message(self, msg, *args):
logger.info(f"[http.server] {msg}", *args)
def log_error(self, msg, *args):
logger.error(f"[http.server] {msg}", *args)
server_ctl = HttpServerControl()
server_ctl.server_handled_request = threading.Event()
LocalHttpServer.address_family, addr = _get_best_family(bind, port)
with LocalHttpServer(addr, RequestHandler) as httpd:
host, port = httpd.socket.getsockname()[:2]
url_host = f"[{host}]" if ":" in host else host
server_ctl.base_url = f"http://{url_host}:{port}/"
logger.info("HTTPServer at %s", server_ctl.base_url)
httpd.start()
yield server_ctl
def http_get(http_server, session, param):
http_server.server_handled_request.clear()
logger.info("[client] GET %s/param?%s", http_server.base_url, param)
with session.get(http_server.base_url, params=dict(param=param)) as response:
response.raise_for_status()
assert len(response.text) == 2 # read everything
logger.info("[client] %s %r", response.status_code, response.text)
http_server.server_handled_request.wait()
def test_http_connection_errors(http_server: HttpServerControl, monkeypatch):
logger_urllib = logging.getLogger("urllib3")
monkeypatch.setattr(logger_urllib, "level", logging.DEBUG)
http_server.force_error = "abort_response"
session = requests.Session()
max_retries = requests.adapters.Retry(total=2, connect=1, read=1)
adapter = requests.adapters.HTTPAdapter(pool_connections=1, pool_maxsize=1, max_retries=max_retries)
session.mount("http://", adapter)
with session:
http_get(http_server, session, "A")
Environment
Describe your environment.
At least, paste here the output of:
OS Windows-10-10.0.19044-SP0
Python 3.10.11
OpenSSL 1.1.1t 7 Feb 2023
urllib3 2.0.3
The text was updated successfully, but these errors were encountered:
joaoe
changed the title
Cyclic references with exceptions cause pools not to be destroyed
Exceptions with stacktraces kept alive for long cause pools not to be destroyed
Jun 30, 2023
Subject
Hi.
This is a tricky subject to understand, but it is a regression introducedy by
a80c248
present in urllib3>=1.26.16
That commit causes poolmanager to clear its state (namely connections) based on the whims of the garbage collector.
However, there is a pretty simple scenario where references to poolmanager can be kept alive for too long and code that would reliably close all connections and clear all state no longer works.
I happened to stumble upon this with a test file which tests for error handling using the requests library which internally uses urllib3.
Basically I trigger errors inside the urlib3 code by using a local mock http server which misbehaves.
The urllib3 code then gets a
RemoteDisconnected()
exception fromhttp/client.py", line 287, in _read_status
.The exception includes the traceback together with outer stack frames. The traceback keeps all objects in the entire call chain pinned in memory.
Then this exception is saved inside a Retries object in
connectionpool.py
, hereAnother location is in urllib3/util/retry.py where a retry object is created with the history of failed requests, which also saves exceptions
Basically, any exception being triggered causes this problem.
To put the cherry on top, these messages are logged, and when using pytest (or another testing framework), it happily collects all log messages (LogRecord objects) for reporting after the test finishes, which in turn keep the retry objects alive inside
LogRecord.args
. So during a unit test, nothing gets garbage collected.Actual Behavior
So below is a smaller version of the test I had which gets stuck indefinetly. It gets stuck in the
HTTPServer.shutdown()
call because it waits for the serve_forever loop to quit and set a threading.Event. However, that loop is stuck because the socket and connection are still open, because the session and connections have not been garbage collected.Expected Behavior
Although exceptions can be saved, their tracebacks should be cleared. Every location where an exception is saves as property of some object, the stacktrace should be cleared with
ex.with_traceback(None)
.Steps to Reproduce
This test hangs.
Environment
Describe your environment.
At least, paste here the output of:
The text was updated successfully, but these errors were encountered: