
Exceptions with stacktraces kept alive for long cause pools not to be destroyed #3082

Open · joaoe opened this issue Jun 29, 2023 · 0 comments

joaoe commented Jun 29, 2023

Subject

Hi.
This is a tricky subject to understand, but it is a regression introduced by commit a80c248, present in urllib3>=1.26.16.

That commit makes the PoolManager clear its state (namely its connections) only when the garbage collector gets around to collecting it. However, there is a pretty simple scenario where references to the PoolManager are kept alive too long, and code that used to reliably close all connections and clear all state no longer works.
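To illustrate the difference (a simplified model of GC-driven cleanup, not urllib3's actual code): cleanup registered to run at collection time only fires once the object becomes unreachable, so any lingering reference silently postpones it.

import gc
import weakref

class Resource:  # stand-in for a connection pool
    pass

res = Resource()
# Cleanup registered to run at collection time, instead of an explicit close():
weakref.finalize(res, print, "cleanup ran")

lingering_ref = res  # e.g. a reference kept alive by a saved traceback
del res
gc.collect()  # prints nothing: `lingering_ref` still keeps the object alive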

I stumbled upon this with a test file that exercises error handling using the requests library, which internally uses urllib3.
Basically, I trigger errors inside the urllib3 code with a local mock HTTP server that misbehaves.
The urllib3 code then gets a RemoteDisconnected() exception raised from http/client.py, line 287, in _read_status.
The exception carries its traceback, and with it the outer stack frames; the traceback keeps every object in the entire call chain pinned in memory.
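A minimal standalone demonstration of that pinning effect (not urllib3 code; names are illustrative):

class Pool:
    pass

def handler():
    pool = Pool()  # a local in this frame, like `self` in urlopen()
    raise RuntimeError("boom")

try:
    handler()
except RuntimeError as exc:
    saved = exc  # keep the exception around, e.g. inside a Retry object

# saved.__traceback__ still references handler()'s frame, and that frame's
# locals still hold `pool`, so the pool cannot be garbage collected even
# though handler() has long returned.
frame = saved.__traceback__.tb_next.tb_frame
print(frame.f_locals["pool"])  # the Pool instance is still reachable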

This exception is then saved inside a Retry object in connectionpool.py, here:

            retries = retries.increment(
                method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
            )

Another location is in urllib3/util/retry.py, where a new Retry object is created with the history of failed requests, which also saves the exceptions:

        new_retry = self.new(
            [...]
            history=history,
        )

Basically, any exception raised during a request triggers this problem, as the sketch below demonstrates.
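A hedged sketch of that mechanism against urllib3's public Retry API (the ProtocolError here is simulated; values are illustrative):

import sys

from urllib3.exceptions import ProtocolError
from urllib3.util.retry import Retry

retry = Retry(total=2)
try:
    raise ProtocolError("Connection aborted.")
except ProtocolError as err:
    # increment() appends a RequestHistory entry whose `error` field keeps
    # the exception object alive, traceback and all.
    retry = retry.increment(
        method="GET", url="/", error=err, _stacktrace=sys.exc_info()[2]
    )

entry = retry.history[-1]
assert entry.error is not None
assert entry.error.__traceback__ is not None  # the whole call chain stays pinned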

To put the cherry on top, these retries are logged. When using pytest (or another testing framework), all log messages (LogRecord objects) are happily collected for reporting after the test finishes, and those records keep the Retry objects alive inside LogRecord.args. So during a unit test, nothing gets garbage collected.
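For example (the handler below is a hypothetical stand-in for pytest's log capture; the quoted log call is roughly what urllib3 emits on a broken connection):

import logging

captured = []

class KeepRecords(logging.Handler):
    def emit(self, record):
        captured.append(record)  # pytest keeps LogRecords until reporting

logging.getLogger("urllib3").addHandler(KeepRecords())

# urllib3 logs roughly:
#     log.warning("Retrying (%r) after connection broken by '%r': %s",
#                 retries, err, url)
# so every captured record's .args references the Retry object and the
# exception, which in turn pin tracebacks, frames, and the pool itself.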

Actual Behavior

Below is a smaller version of my test, which gets stuck indefinitely. It hangs in the HTTPServer.shutdown() call, which waits for the serve_forever() loop to quit and set a threading.Event; that loop never exits because the socket and connection are still open, since the session and connections have not been garbage collected.

Expected Behavior

Although exceptions may be saved, their tracebacks should be cleared. At every location where an exception is saved as a property of some object, the stacktrace should be dropped with ex.with_traceback(None), as sketched below.
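A minimal sketch of that pattern (illustrative, not a patch; risky_call is a stand-in for the failing request):

def risky_call():
    raise OSError("connection broken")

saved_error = None
try:
    risky_call()
except Exception as exc:
    # Keep the exception for later reporting, but drop the traceback so the
    # frames in the call chain (and their locals) become collectable.
    saved_error = exc.with_traceback(None)

assert saved_error.__traceback__ is None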

Steps to Reproduce

This test hangs.

import logging
import threading

import pytest
import requests.adapters

logger = logging.getLogger(__name__)


class HttpServerControl:
    force_error: str = ""
    base_url: str = ""
    server_handled_request: threading.Event


@pytest.fixture(scope="module")
def http_server() -> HttpServerControl:
    from http.server import BaseHTTPRequestHandler, HTTPServer, HTTPStatus, _get_best_family

    # The code in this function was adapted from http.server.test()

    protocol = "HTTP/1.1"  # HTTP 1.1 supports automatic Keep-Alive
    port = 0  # Use port 0 so a free random port is allocated
    bind = "127.0.0.1"  # Bind to localhost so nothing else can connect.

    class LocalHttpServer(HTTPServer):
        thread = None
        socket = None

        def shutdown_request(self, request):
            super().shutdown_request(request)
            server_ctl.server_handled_request.set()

        def shutdown(self, *_):
            logger.info("HTTPServer shutdown")
            if self.thread:
                super().shutdown()
                self.thread.join()
            self.socket = self.thread = None

        def start(self):
            assert not self.thread
            self.thread = threading.Thread(target=self.serve_forever)
            self.thread.start()

        def __enter__(self):
            return self

        __del__ = __exit__ = shutdown

    class RequestHandler(BaseHTTPRequestHandler):
        protocol_version = protocol

        def do_GET(self):
            self.log_message("GET %s HTTP/%s", self.path, self.protocol_version)
            force_error, server_ctl.force_error = server_ctl.force_error, ""
            if force_error:
                self.close_connection = True
                self.log_message("Mark close connection: forced %s [%s]", force_error, self.path)
                return

            contents = b"{}"
            try:
                self.log_message("GET OK %s [start response]", self.path)
                self.send_response(HTTPStatus.OK)
                self.send_header("Content-type", "application/json")
                self.send_header("Content-Length", str(len(contents)))
                self.end_headers()
                self.wfile.write(contents)
                self.log_message("GET OK %s [end response]", self.path)

            except OSError as e:  # Host stops reading too early.
                self.log_message("GET OK %s [failed response]\n%s", self.path, str(e))

        def log_message(self, msg, *args):
            logger.info(f"[http.server] {msg}", *args)

        def log_error(self, msg, *args):
            logger.error(f"[http.server] {msg}", *args)

    server_ctl = HttpServerControl()
    server_ctl.server_handled_request = threading.Event()

    LocalHttpServer.address_family, addr = _get_best_family(bind, port)
    with LocalHttpServer(addr, RequestHandler) as httpd:
        host, port = httpd.socket.getsockname()[:2]
        url_host = f"[{host}]" if ":" in host else host
        server_ctl.base_url = f"http://{url_host}:{port}/"
        logger.info("HTTPServer at %s", server_ctl.base_url)
        httpd.start()

        yield server_ctl


def http_get(http_server, session, param):
    http_server.server_handled_request.clear()

    logger.info("[client] GET %s/param?%s", http_server.base_url, param)
    with session.get(http_server.base_url, params=dict(param=param)) as response:
        response.raise_for_status()
        assert len(response.text) == 2  # read everything
        logger.info("[client] %s %r", response.status_code, response.text)

    http_server.server_handled_request.wait()


def test_http_connection_errors(http_server: HttpServerControl, monkeypatch):
    logger_urllib = logging.getLogger("urllib3")
    monkeypatch.setattr(logger_urllib, "level", logging.DEBUG)

    http_server.force_error = "abort_response"

    session = requests.Session()
    max_retries = requests.adapters.Retry(total=2, connect=1, read=1)
    adapter = requests.adapters.HTTPAdapter(pool_connections=1, pool_maxsize=1, max_retries=max_retries)
    session.mount("http://", adapter)

    with session:
        http_get(http_server, session, "A")

Environment


OS Windows-10-10.0.19044-SP0
Python 3.10.11
OpenSSL 1.1.1t  7 Feb 2023
urllib3 2.0.3
joaoe changed the title from "Cyclic references with exceptions cause pools not to be destroyed" to "Exceptions with stacktraces kept alive for long cause pools not to be destroyed" on Jun 30, 2023