More tests for picos.sync #83

Open
polytypic opened this issue Apr 8, 2024 · 2 comments

Comments

polytypic commented Apr 8, 2024

I may have observed the picos.sync tests deadlocking or livelocking on CI, at least on (32-bit) OCaml 4.14. This might indicate a bug in the picos.sync library, a bug in the test, or a bug in (32-bit) OCaml 4.14 (I don't recall seeing the test fail to complete on other OCaml versions, but I might simply have missed it). It might also be completely unrelated, such as the test machine being slow for some other reason. At any rate, this needs further investigation, and the correctness of the picos.sync library implementation needs to be ensured.

Observations:

  • debian-12-4.14_arm32_opam-2.1 (not completed after 24+ minutes, completed very quickly after cancel+rebuild)
  • If thread-local-storage is (for some reason) not installed, it was possible, before #110 ("When threads.posix exists we really need thread-local-storage"), to build a non-working set of libraries where the mutex cancelation test and benchmarks did not terminate. This shouldn't be the explanation for the non-completion observed on CI, though.
  • Tried running the picos_sync test repeatedly in parallel (dozen or so) with OCaml 4.14.2 on macOS with M1. Did not get any lockups within a few hours.
  • debian-12-4.14_arm64_opam-2.1 (not completed in an hour)
  • debian-12-4.14_opam-2.2 (seemed to be stuck in the cancelation test)
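
The repeated parallel runs described in the observations above could be automated along these lines. This is only a minimal sketch, not the procedure actually used: the helper names `run_once` and `stress`, the run counts, and the 60-second timeout are all assumptions, and the test executable has to be supplied on the command line. A run that exceeds the timeout is counted as a suspected hang rather than left to block for the whole job.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, timeout_s):
    """Run cmd once and classify the result as 'ok', 'failed', or 'timeout'."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a hung
        # test process is not left running in the background.
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"


def stress(cmd, runs, parallel, timeout_s):
    """Run cmd `runs` times, at most `parallel` at once, and tally outcomes."""
    tally = {"ok": 0, "failed": 0, "timeout": 0}
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        for outcome in pool.map(lambda _: run_once(cmd, timeout_s), range(runs)):
            tally[outcome] += 1
    return tally


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical invocation (script name and path are placeholders):
    #   python stress.py <path-to-test-executable> [args...]
    print(stress(sys.argv[1:], runs=120, parallel=12, timeout_s=60))
```

A non-zero `timeout` count on one platform but not another would help separate a genuine hang from a slow test machine, since each run gets its own deadline.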
polytypic changed the title from "More tests picos.sync" to "More tests for picos.sync" on Apr 8, 2024

polytypic commented Aug 19, 2024

It might be that the issue was related to the cancelation test spawning fibers, which translate to systhreads on OCaml 4. PR #230 changes the tests to not spawn fibers. Time will tell whether this eliminates the hangs on OCaml 4.

Addition: There was a test run where the cancelation test did not seem to complete on 4.14 arm64. Not spawning lots of systhreads seems to have made the failures less common.

@polytypic

@edwintorok mentioned the pthread_cond_wait bug, which might be the cause of the issues.
