mutex savenexus calls to prevent segfaults #544
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

| Coverage Diff | next | #544 | +/- |
|---|---|---|---|
| Coverage | 96.02% | 96.02% | |
| Files | 71 | 71 | |
| Lines | 5463 | 5463 | |
| Hits | 5246 | 5246 | |
| Misses | 217 | 217 | |
add back the sleeps
remove algo from manager when complete
added timeout in case of long async algo queue
reorg where we check and wait for algos to complete, share a lock between potential cross interference
both reentrant and nonconcurrent mutexes
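The commit notes above mention both reentrant and non-reentrant mutexes and a timeout. As a general illustration of that distinction in Python (not the code from this PR; the helper name `reacquire_on_same_thread` is hypothetical), a thread that already holds a `threading.RLock` can take it again, while a second `acquire` on a plain `threading.Lock` from the same thread can only give up once its timeout expires:

```python
import threading


def reacquire_on_same_thread(lock, timeout=0.05):
    """Hold `lock`, then try to take it a second time on the same thread.

    Returns True if the second acquire succeeded (reentrant lock) and
    False if it timed out (non-reentrant lock).
    """
    lock.acquire()
    try:
        # Without a timeout, a plain Lock would deadlock at this point.
        got_it = lock.acquire(timeout=timeout)
        if got_it:
            lock.release()  # undo the nested acquire
        return got_it
    finally:
        lock.release()  # undo the outer acquire


if __name__ == "__main__":
    print("RLock:", reacquire_on_same_thread(threading.RLock()))
    print("Lock: ", reacquire_on_same_thread(threading.Lock()))
```

A timeout on the acquire turns a potential deadlock into a condition the caller can detect and report, which is presumably the motivation for the timeout mentioned in the commit notes.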
e155126 to de1412c
I'm running the tests now, but you might as well fix the spelling error. That is the only comment that I had.
I verified that the provided cis-test script definitely produces a SEGFAULT on "next". (The script needs to be run in workbench at the same time as the reduction workflow is running in SNAPRed.) The SEGFAULT occurs during the final SaveNexus write of the reduced output workspaces.
On this branch, however, the same test completes successfully without any SEGFAULT. I can definitely notice a slowdown in certain sections, usually where a RenameWorkspace operation is waiting on one of the SaveNexus calls. This behavior is reassuring, as it indicates that the work-around is clearly being applied.
Description of work
This adds a mutex for SaveNexus in MantidSnapper to mitigate the race condition we are experiencing.
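A minimal sketch of the general idea, assuming a single module-level lock shared by every save call. The names `_save_lock`, `serialized`, `fake_save_nexus` and `run_concurrent_saves` are hypothetical and are not taken from MantidSnapper; the real Mantid algorithm is replaced by a stand-in that only sleeps and counts how many calls overlap:

```python
import functools
import threading
import time

# Hypothetical shared lock: every guarded call must hold it while running.
_save_lock = threading.Lock()


def serialized(fn):
    """Decorator that lets only one call to `fn` run at a time."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with _save_lock:
            return fn(*args, **kwargs)
    return wrapper


# Bookkeeping used only to observe how many saves overlap.
_active = 0
max_active = 0
_counter_lock = threading.Lock()


@serialized
def fake_save_nexus(filename):
    """Stand-in for a save call; it sleeps instead of writing a file."""
    global _active, max_active
    with _counter_lock:
        _active += 1
        max_active = max(max_active, _active)
    time.sleep(0.01)  # pretend to write
    with _counter_lock:
        _active -= 1
    return filename


def run_concurrent_saves(n=8):
    """Start n threads that all save at once; return the peak overlap seen."""
    threads = [
        threading.Thread(target=fake_save_nexus, args=(f"out_{i}.nxs",))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active


if __name__ == "__main__":
    print("peak concurrent saves:", run_concurrent_saves())
```

Serializing the calls trades throughput for safety: anything that has to wait for the lock is delayed until the current save finishes.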
To test
NOTE: Test script points to data in MY directory, update paths before you run!
Attempt to reduce 64486; default calibration and artificial normalization are fine.
Run the cis script in this PR at the same time.
Repeat until self-assured.
Observe no segfault.
Link to EWM item
EWM#8534
Verification
Acceptance Criteria
This list is for ease of reference, and does not replace reading the EWM story as part of the review. Verify this list matches the EWM story before reviewing.