-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
What should happen when an FD is closed while a thread is blocked on it? #4051
Comments
POSIX says nothing on this. In theory it counts as a race condition, as you can't ever say another thread has blocked on this FD already (there's no general way to ensure that has happened yet, it may be right before the thread actually executes the blocking op). In practice, once actually blocked, the thread just never wakes. So... beyond going to reporting a race condition error instead of an unsupported error, I see no good course of action. We picked unsupported as it's unspecified what the behaviour is, and in practice it's definitely not "return the same result as if the FD were closed before the read/write call was done" |
I can't parse this, what do you mean?
That would mean making it UB. The question is, is it UB? |
@the8472 do you know if there is any required behavior from POSIX here? "Never wake up the thread" does seem like a reasonable implementation we could copy. Maybe we should issue some sort of warning though? We don't have much in the way of warnings in Miri currently, but after all this is almost certainly a bug. |
When thread A is running on a code path that will inevitably block on FD, and thread B is closing FD, we are in an unclear state wrt what happens to thread A. Depending on the scheduler it
It's unspecified if the third option can happen. The first option leads to the thread staying blocked forever in practice (this may also depend on the type of FD, pipes seem in theory more likely to unblock than e.g. a file) The second option is well specified |
I decided we should pick unsupported as this is always a bug according to all stackoverflow and other forum answers I could find. These often state that it's a race condition, but without any references. On a more high level, a FD should only be closed if there are no outstanding uses of it. The docs of close say the word "race condition"
|
"unsupported" indicates a missing feature in Miri, not a bug in the program. The error text even explicitly says that. So according to what you said we should emit a different error.
|
The epoll man page says the following:
In other words the fd will be removed from the interest list if there is no dup of the fd still around. If it is the last fd in the interest list, epoll_wait will block until another thread inserts a new fd in the interest list and this new fd triggers an event I think, just like when calling epoll_wait when the interest list was empty from the start. |
The epoll case I was talking about above is about what happens when the epoll FD itself is closed while someone blocks on it. What happens when an FD in the interest list is closed, is a separate question, but I think that is already handled properly. epoll identifies interest by the pair of FD number and underlying file description object; if the object is closed then even if the number gets reused the interest will never be triggered. |
@tiif do we have a test for this situation of "the FD registered with epoll got closed but there's a duplicate FD for the same file description so events are still reported"? I guess that means events are reported with the old FD number, which is actually quite confusing since it won't refer to the relevant object any more... |
epoll is a linux-specific api. poll is the posix standard one, but I haven't seen any explicit documentation whether a wakeup will be triggered by a concurrent close or not.
File descriptions are refcounted resources (on linux). So I assume that closing the file descriptor will only decrease the count by one but the ongoing polling syscall will own another token and so the underlying description will remain open at least until the syscall returns. Since epollfd is linux-specific the portability of those semantics shouldn't be relevant here. |
I really don't want to replicate per-FD-kind behaviour on close while blocked, with platform dependent shenanigans. It's why I opted for the unsupported, since we do not support the (potentially guaranteed) platform or FD specific behaviour. The situations are always an edge case that you probably don't want to encounter anyway. We could add a new |
I don't think we should have per-FD-type logic, either. We could raise a "deadlock" error, since it does mean that this thread will definitely never be woken up.
So that would mean even if the epoll FD is closed, future wakeups on threads waiting on that FD still happen, since the underlying data is still present. Makes sense. We should probably implement that, then. |
yes miri/tests/pass-dep/libc/libc-epoll-no-blocking.rs Lines 339 to 383 in 158973f
|
If we plan to support "closing fd while blocking on it" operation for epoll, should we do the same for |
Those are different situations though -- one is blocking on read/write, the other is blocking on a dedicated operation ( |
I expect it'd be similar for |
Yeah but we are not talking about a closed peer fd here.
|
The blocked fd could be closed too, the case that I am thinking of is:
which of the following should happen here?
|
Since in practice the thread will not be unblocked, we should not do 2. So my proposal would be to throw a "deadlocked" error either when the FD is closed with a non-empty wait list, or when the thread wakes up and finds that its FD is gone.
|
Sure, this makes sense. It would be nice if we could throw a more specific error than deadlock if we encounter this scenario. The reason is if someone manages to make this work on a certain platform in a real system, but it fails in miri with a deadlock error, it'd be helpful to know that it encountered this specific scenario. This way, we can rule out the possibility of it being a miri implementation bug early on. |
Yeah ideally the error would point at both the "close", and the thread that can now no longer wake up.
And then for epoll - we should probably make the Weak ref into an Rc so that the unblock callback can definitely access the epoll instance?
|
Yea, and perhaps we need to close it again to properly destruct it from the global machine state after we finish using it. |
For both epoll and eventfd, the question arises what happens when an FD is closed while a thread is blocked on it. Currently, we throw an "unsupported" error. That's not a great solution -- we should figure out what POSIX guarantees in that case, and implement that.
So, what does POSIX say?
Cc @tiif
The text was updated successfully, but these errors were encountered: