Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipc_*: The (very few) blocking APIs' behavior on signal = inconsistent, some uninterruptible. #88

Open
ygoldfeld opened this issue Mar 12, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@ygoldfeld
Copy link
Contributor

ygoldfeld commented Mar 12, 2024

In most of Flow-API, APIs are non-blocking. There are various key async APIs, but that's not blocking. There are however a handful of potentially-blocking APIs. Ideally these should behave in some reasonable way on signal, so that at least it would be possible to interrupt a program on SIGTERM while it is blocking inside such an API. In POSIX generally blocking I/O functions will emit EINTR in this situation. We should do something analogous; or failing that at least something predictable and consistent; or failing that present mitigations (but in that case not resolve this ticket).

Here are the existing blocking APIs as of this writing.

  • Concept: Persistent_mq_handle::[timed_]send(), [timed_]receive(), [timed_]wait_sendable(), [timed_]wait_receivable(). Impls:
    • Posix_mq_handle::*()
      • Internally this epoll_wait()s. Signal => EINTR; so we will also emit Error_code around EINTR.
      • Assessment: Pretty good, in that it is interruptible with minimum fuss.
    • Bipc_mq_handle::*()
      • Internally this (via some Boost code, long story) pthread_cond_[timed]wait()s. How this will act on signal is unclear; as of this writing I have not checked or do not remember also what the Boost code does in wrapping the pthread_ call. But the call itself is documented to yield no EINTR ever in some man pages, and EINTR only from pthread_cond_timedwait() (not the one without timed_ though) in some other man pages.
      • Assessment: Unclear, but not good. It will almost certainly not work in some cases and will just keep waiting.
  • struc::Channel::sync_request() (+ future likely similar API(s) including sync_expect_msg())
    • Internally this uses promise::get_future().wait[_for]() which almost certainly internally in turn uses pthread_cond_[timed]wait().
    • Assessment: Probably similar to Bipc_mq_handle situation but would need to be verified.
  • Nota bene: sync_connect() (in socket-stream and client-session) are today formally non-blocking, because it's non-networked IPC; but there are plans to make it networked-IPC and those guys in that case will become potentially-blocking. At that point we could leverage built-in EINTR probably; details TBD. Just keep it in mind.

Ideally all three -- and any future blocking impls -- should get interrupted if-and-only-if a built-in blocking call such as ::read() or ::poll() would be interrupted, on signal. How they're interrupted is less important: whether it's EINTR or another Error_code, as long as it's documented (that said perhaps it's best to standardize around an error::Code of ours we'd add for this purpose).

I'll forego explaining how to impl this. The Posix_mq_handle guy is already there; but the other two are a different story. (For Bipc_mq_handle one could somehow leverage the already-existing Persistent_mq_handle::interrupt_*() APIs. But, then another thread would be necessary. Food for thought at any rate.) (For Channel, not sure, but we should be able to swing something; maybe an intermediary thread; signal => satisfy the promise.)

Lastly, but very much not leastly (sorry), users should be aware of the following mitigating facts, until this ticket is resolved:

  • Persistent_mq_handle:
    • This concept is not really primarily meant for direct use. It can be -- it is nicer than using the POSIX MQ or bipc::message_queue APIs directly, and it adds features including interruptibility to the latter -- but that's a bonus really; these exist to implement Blob_stream_mq_sender/receiver higher-level APIs, and those work just fine (there is no signal-interrupt issue): they (internally) avoid blocking calls except .wait_able(), and the latter is .interrupt_()ed if needed (namely from dtor).
    • If you do use it:
      • interrupt_*() is available to you. So if it is necessary to interrupt a wait or blocking-send/receive, then you can use a separate thread to run it in and invoke interrupt_*() from a signal handler. This isn't no-fuss, but it does work if needed at least.
      • With Posix_mq_handle, you do get the nice normal EINTR behavior already.
      • timed_*() are also available: You could use those with a short, but not-too-short, timeout -- in which case a signal can affect them but only after a bit of lag; could be fine. This is the usual approach... not perfect but doable and pretty simple.
  • struc::Channel::sync_request():
    • This guy is only blocking if locally there is no expect_msg[s]() pending on the opposing side (for future speculated sync_expect_msg() API: if there is no already-received-and-queued-inside in-message of the proper type, or one isn't forthcoming immediately).
    • But in many -- most -- cases protocols are set up in such a way as to in fact make sync_request() non-blocking: It is expected the opposing side has set-up their expect_msg[s]() by the time we issue the sync_request(). That's just good for responsive IPC; and it's nice not to need an async call like async_request(), where a non-blocking one would do much more simply.
      • For example we use sync_request() internally in other parts of Flow-IPC this way routinely.
@ygoldfeld ygoldfeld added the bug Something isn't working label Mar 12, 2024
@ygoldfeld ygoldfeld changed the title ipc_*: Tie up loose ends regarding the (very few) blocking APIs' behavior on signal. ipc_*: The (very few) blocking APIs' behavior on signal = inconsistent, some uninterruptible. Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant