Exceptions not handled in server thread #983

Open
andy-maier opened this issue Nov 28, 2023 · 1 comment

andy-maier commented Nov 28, 2023

We are occasionally getting an ssl.SSLEOFError when using the prometheus client with HTTPS (as of the new version 0.19.0). That error is raised in the server thread and because it is not handled there, the entire process dies:

Traceback (most recent call last):
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 138, in run
    self.finish_response()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 184, in finish_response
    self.write(data)
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 288, in write
    self.send_headers()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 346, in send_headers
    self.send_preamble()
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 268, in send_preamble
    self._write(
  File "/usr/lib/python3.10/wsgiref/handlers.py", line 467, in _write
    result = self.stdout.write(data)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
  File "/usr/lib/python3.10/ssl.py", line 1237, in sendall
    v = self.send(byte_view[count:])
  File "/usr/lib/python3.10/ssl.py", line 1206, in send
    return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2426)

I verified that this exception is not raised within start_http_server(), so it must be raised in the server thread.

The goal would be that an exception raised in the server thread causes that thread to terminate, and that the main thread has some means of retrieving the exception. The solution would need to work for both HTTP and HTTPS.

There seem to be different techniques on how that can be achieved:

This would probably require that the start_http_server() function returns the thread object (see issue #883)
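
One such technique, sketched here only as an illustration: a Thread subclass that stores any exception raised by its target and re-raises it in the caller on join(). The names PropagatingThread and failing_server_loop are hypothetical and not part of prometheus_client, and the sketch assumes the caller can get hold of the thread object.

```python
import threading


class PropagatingThread(threading.Thread):
    """Hypothetical helper: remembers an exception raised by the target."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.exception = None

    def run(self):
        try:
            super().run()
        except BaseException as exc:
            # Store the exception instead of letting the thread print it.
            self.exception = exc

    def join(self, timeout=None):
        super().join(timeout)
        # Re-raise in the joining thread once the worker has failed.
        if self.exception is not None:
            raise self.exception


def failing_server_loop():
    # Stand-in for a server loop that fails; purely illustrative.
    raise RuntimeError("simulated failure in server thread")


def run_demo():
    thread = PropagatingThread(target=failing_server_loop, daemon=True)
    thread.start()
    try:
        thread.join()
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(run_demo())
```

This only works if the code that starts the server keeps a reference to the thread, which is why returning the thread object from start_http_server() would be a prerequisite.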

andy-maier commented Nov 28, 2023

After some more investigation, I found that these exceptions are actually handled. They are just printed with a traceback, which makes them look unhandled.

They are printed by the wsgiref.handlers.BaseHandler.handle_error() method in the request handler threads run by the WSGIServer (see https://github.com/python/cpython/blob/main/Lib/wsgiref/handlers.py#L382). That method logs the traceback and, if no headers have been sent yet, also sends an error response.
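
The behavior can be reproduced outside the exporter with a small sketch. It assumes a stand-in output stream (the hypothetical FailingWriter) that raises ssl.SSLEOFError on every write, and drives wsgiref.handlers.SimpleHandler directly instead of going through a real socket. The handler's run() call returns normally, and the traceback ends up in the stderr stream passed to the handler.

```python
import io
import ssl
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


class FailingWriter:
    """Hypothetical stand-in for the socket writer; every write fails."""

    def write(self, data):
        raise ssl.SSLEOFError("EOF occurred in violation of protocol")

    def flush(self):
        pass


def app(environ, start_response):
    # Trivial WSGI application used only for this demonstration.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def run_once():
    environ = {}
    setup_testing_defaults(environ)
    stderr = io.StringIO()
    handler = SimpleHandler(io.BytesIO(), FailingWriter(), stderr, environ)
    # run() catches the write error and calls handle_error(), which
    # prints the traceback to stderr; nothing propagates to the caller.
    handler.run(app)
    return stderr.getvalue()


if __name__ == "__main__":
    print(run_once())
```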

In fact, I found that at some level (below the prometheus-client) a retry must happen, because the exporter happily responds to a web browser when I inject the exception into socketserver._SocketWriter.write(). See zhmcclient/zhmc-prometheus-exporter#439 for more details.

So I guess the questions are:

  • Are we sure that such exceptions are fully recovered from?
  • Can something be done to avoid confusing users who may think this is an unhandled exception?
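
For the second question, one possible direction (an assumption for discussion, not something prometheus_client does today) is a handler subclass that overrides log_exception() and emits a single log line for connection-level errors instead of a full traceback. QuietHandler, FailingWriter, ListHandler and the logger name are hypothetical.

```python
import io
import logging
import ssl
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults

log = logging.getLogger("exporter.http")  # hypothetical logger name


class QuietHandler(SimpleHandler):
    """Hypothetical handler: one-line log for client connection errors."""

    def log_exception(self, exc_info):
        exc_type, exc, _tb = exc_info
        if isinstance(exc, (ssl.SSLError, ConnectionError)):
            log.warning("Handled client connection error: %s: %s",
                        exc_type.__name__, exc)
        else:
            # Anything else keeps the default traceback output.
            super().log_exception(exc_info)


class FailingWriter:
    """Stand-in output stream whose writes always fail."""

    def write(self, data):
        raise ssl.SSLEOFError("EOF occurred in violation of protocol")

    def flush(self):
        pass


class ListHandler(logging.Handler):
    """Collects formatted log messages so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def demo():
    collector = ListHandler()
    log.addHandler(collector)
    try:
        environ = {}
        setup_testing_defaults(environ)
        stderr = io.StringIO()
        QuietHandler(io.BytesIO(), FailingWriter(), stderr, environ).run(app)
        return stderr.getvalue(), collector.messages
    finally:
        log.removeHandler(collector)


if __name__ == "__main__":
    print(demo())
```

In a real server the handler class is created inside wsgiref.simple_server.WSGIRequestHandler.handle(), so wiring in such a subclass would need a custom request handler as well; the sketch leaves that part out.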
