CNDB-14460: Fix Nodes test flakiness resulting from unsafe interleaving of async operations in test scenarios #1812
What is the issue
The singleton Nodes instance serializes operations that must not overlap by running them on a single-threaded executor. Some of these operations run synchronously, with the caller waiting on the future; others run asynchronously, queued on the executor without waiting. Some tests shut down and replace the singleton Nodes instance in an unsafe manner to exercise cases where a node is restarted. This shutdown neither terminates the executor nor waits on it, because the asynchronous tasks can safely be recovered on node restart. In the tests, however, these asynchronous operations can interleave with operations on the newly created Nodes instance, so they no longer have the expected isolation and the tests fail.
Async operations can also interleave with JUnit's deletion of the temporary directories backing a Nodes instance.
What does this PR fix and why was it fixed
When unsafely replacing the singleton Nodes instance in tests, trigger a shutdown on the executors and await in-flight tasks.
When shutting down at the end of a test, await in-flight tasks.
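The shutdown-and-await step can be illustrated with a minimal sketch using only `java.util.concurrent`. It is not the code from this PR: the class `ExecutorDrainSketch`, the helper `shutdownAndAwait`, and the timeout values are hypothetical names chosen for illustration, and the actual Nodes executor wiring may differ.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the Nodes implementation.
final class ExecutorDrainSketch {

    // Stops the executor from accepting new tasks, then blocks until the
    // already-queued tasks finish or the timeout elapses.
    // Returns true if the executor terminated within the timeout.
    static boolean shutdownAndAwait(ExecutorService executor, long timeout, TimeUnit unit)
            throws InterruptedException {
        executor.shutdown(); // queued tasks still run; new submissions are rejected
        return executor.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the single-threaded executor that sequences operations.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger completed = new AtomicInteger();

        // Queue a few asynchronous tasks without waiting on them.
        for (int i = 0; i < 3; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(50); // simulate work that outlives the caller
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }

        // Before replacing the instance (or letting the test framework delete
        // its temporary directories), drain the in-flight tasks.
        boolean drained = shutdownAndAwait(executor, 10, TimeUnit.SECONDS);
        System.out.println("drained=" + drained + " completed=" + completed.get());
    }
}
```

Using `shutdown()` rather than `shutdownNow()` lets queued tasks run to completion instead of interrupting them, so no task is still touching shared state when the replacement instance is created or the test's directories are removed.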