RCORE-2063 Fix flakey 'Test client migration and rollback with recovery' test #7542
base: master
Pull Request Test Coverage Report for Build michael.wilkersonbarker_1058
💛 - Coveralls
Don't we want to fix the real issue too (clearing the client reset tracking in case of a rollback)?
Oh, yes. The test was originally rolling back during the client reset on purpose to see what the issues were, and the client reset tracking entry was being left over, causing the original failure.
…ing if action type changes
Added client reset error and action tracking to the client reset metadata storage (producing v2). If the new client reset action is different from the current client reset action, the older client reset tracking is removed in favor of the new client reset. Also updated the "Test client migration and rollback with recovery" test to reflect this operation.
Can we break this PR up into three separate PRs? 1) adding the extra info to the client reset tracker to show what the original error was, 2) changing it so that if a new client reset of a different type starts it will be allowed to continue, and 3) any changes to fix up the test?
Sure, I can do that, although it may be better to create two PRs: one for the additions and another that allows a different type to continue, plus the updates to the migration test.
What, How & Why?
Updated the client reset tracking to store the original client reset error and action. If a new client reset occurs with a different action (e.g. rolled back to PBS), the original client reset tracking info (e.g. for migrated to FLX) will be removed and the new client reset will be allowed to continue.
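As a rough illustration of the tracking rule described above, here is a minimal in-memory sketch. It is not the realm-core implementation: `ResetAction`, `PendingReset`, `ResetTracker`, and `track_reset()` are hypothetical names invented for this example, and the choice to reject a second reset with the same action is an assumption, since the description only covers the case where the action differs.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins; these are NOT the realm-core types.
enum class ResetAction { ClientReset, MigrateToFLX, RevertToPBS };

struct PendingReset {
    ResetAction action; // action that triggered the tracked client reset
    std::string error;  // original error that started the client reset
};

// In-memory stand-in for the client reset metadata storage.
class ResetTracker {
public:
    // Returns true if the new client reset is allowed to continue.
    bool track_reset(ResetAction action, const std::string& error)
    {
        if (m_pending && m_pending->action == action) {
            // Assumption: a repeated reset with the same action is treated
            // as a reset cycle and is not allowed to continue.
            return false;
        }
        // Either nothing is tracked, or the action differs: drop the older
        // entry and track the new client reset instead.
        m_pending = PendingReset{action, error};
        return true;
    }

    const std::optional<PendingReset>& pending() const
    {
        return m_pending;
    }

private:
    std::optional<PendingReset> m_pending;
};
```

In this sketch, a tracked "migrate to FLX" entry is replaced when a "revert to PBS" reset arrives, which mirrors the rollback case the test exercises.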
Added extra sync client hook events to capture different steps along the way during a client reset. This allows the "Test client migration and rollback with recovery" test to pause the client reset and roll back to PBS while the FLX migration client reset is in progress, effectively reproducing the condition that was intermittently failing in the past. This test was previously taking 90+ secs to complete, and about a third of this time was spent waiting for the reconnect timer to expire for the sync session that was active during migration and rollback. Added a `handle_reconnect()` call after migration/rollback to cancel this timer, shaving around 30 secs off this test.
Fixes #7539, #6154
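The hook-based approach can be pictured with a small single-threaded sketch. Everything here is hypothetical: the `HookEvent` values, `HookAction`, and `run_client_reset()` are illustrative names and do not correspond to the actual sync client hook API. The idea is only that a test-supplied callback is invoked at each step and can intervene at a chosen one.

```cpp
#include <functional>
#include <initializer_list>
#include <vector>

// Hypothetical step names for a simulated client reset.
enum class HookEvent { ResetStarted, FreshRealmDownloaded, DiffApplied, ResetFinished };
enum class HookAction { Continue, Abort };

using Hook = std::function<HookAction(HookEvent)>;

// Walks through the simulated steps, calling the hook at each one.
// Returns the events that were reached before the hook asked to stop.
std::vector<HookEvent> run_client_reset(const Hook& hook)
{
    std::vector<HookEvent> reached;
    for (HookEvent ev : {HookEvent::ResetStarted, HookEvent::FreshRealmDownloaded,
                         HookEvent::DiffApplied, HookEvent::ResetFinished}) {
        reached.push_back(ev);
        if (hook && hook(ev) == HookAction::Abort) {
            break; // the test interrupted the reset at this step
        }
    }
    return reached;
}
```

A test could pass a hook that triggers its rollback when it sees one particular event and returns `HookAction::Abort`, leaving the simulated reset interrupted mid-way.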
☑️ ToDos
[ ] C-API, if public C++ API changed
[ ] bindgen/spec.yml, if public C++ API changed