Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windows: Fix server hangs under some circumstance #674

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

JunkuiZhang
Copy link

@JunkuiZhang JunkuiZhang commented Feb 28, 2025

Hi, thank you so much for creating such a high-quality Rust library.

We encountered a tricky issue while using this library in the development of Zed (https://github.com/zed-industries/zed). After testing, we identified the issue as a bug in notify. This PR provides a test case, test_windows_trash_dir(), which reproduces the bug.

The bug can be reproduced as follows: in a project directory /path/to/project, there is a subfolder /path/to/project/child. If we monitor changes in the child folder, then move the child folder to the trash (deleting the folder directly does not trigger the bug), and monitor the project directory, the server gets stuck at the WaitForSingleObjectEx command.

This issue seems to be related to the bAlertable flag in WaitForSingleObjectEx, but we are still unclear why setting bAlertable to TRUE causes this bug.

@JunkuiZhang
Copy link
Author

Moving a folder to the Recycle Bin or deleting a folder triggers ERROR_ACCESS_DENIED, but this error wasn't handled properly, leading to an infinite loop. Here’s what happens in detail:

  1. When dir_a is being watched, the server calls ReadDirectoryChangesW to monitor changes and registers handle_event as the completion routine.

  2. After dir_a is moved to the Recycle Bin, a change is detected, triggering handle_event. At this point, error_code is ERROR_ACCESS_DENIED.

  3. The original code does not handle this error, so execution continues, calling start_read, which in turn calls ReadDirectoryChangesW again.

  4. Since dir_a no longer exists, ReadDirectoryChangesW immediately returns 1, which is interpreted as a successful call. This causes handle_event to be registered again.

  5. Windows then calls handle_event again, with ERROR_ACCESS_DENIED, repeating the cycle indefinitely:

    handle_event(ERROR_ACCESS_DENIED) -> start_read(ReadDirectoryChangesW, return 1) -> handle_event(ERROR_ACCESS_DENIED) -> start_read(ReadDirectoryChangesW, return 1)...
    

Interestingly, when the folder is deleted instead of moved to the Recycle Bin, the second ReadDirectoryChangesW call returns 0, indicating failure, which prevents the loop. Additionally, setting bAlertable = FALSE stops the completion routine from being called, avoiding the issue entirely.

Changes in this PR:

  • Handle ERROR_ACCESS_DENIED: If the folder no longer exists, it has either been deleted or moved to the Recycle Bin, so we unwatch it.
  • Improve error handling: Known error codes are handled appropriately. For unknown error codes, start_read is not called to prevent potential infinite loops in the future.

@JunkuiZhang JunkuiZhang marked this pull request as ready for review March 7, 2025 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant