PredictionThreshold error drops requests missing correction #75

Open
MaxCWhitehead opened this issue Apr 10, 2024 · 3 comments · May be fixed by #77
Labels
bug Something isn't working

MaxCWhitehead commented Apr 10, 2024

Describe the bug

When a client receives input and flags a first_incorrect frame, adjust_gamestate pushes load and advance requests to perform the rollback. It then calls self.sync_layer.reset_prediction(), which resets the tracked first_incorrect frame.

In advance_frame, if applying the local input returns a PredictionThreshold error, the function exits with that error and drops the requests for the correction. Because the first incorrect frame has already been reset, the correction is missed, which causes a desync.

I'm still reasoning about exactly where the issue is / what the solution is. I believe the example games are vulnerable to this bug too.
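
To make the ordering problem concrete, here is a toy model of the control flow described above. It is a sketch, not the ggrs implementation: `ToySession`, its fields, the simplified threshold check, and the frame numbers are all assumptions made for illustration, with names borrowed from the description only for readability.

```rust
// Toy model of the failure mode described above. None of these types are
// the real ggrs types; everything here is hypothetical.

#[derive(Debug, PartialEq)]
enum Request {
    LoadGameState { frame: i32 },
    AdvanceFrame,
}

#[derive(Debug, PartialEq)]
enum SessionError {
    PredictionThreshold,
}

struct ToySession {
    current_frame: i32,
    last_confirmed_frame: i32,
    max_prediction: i32,
    first_incorrect: Option<i32>,
}

impl ToySession {
    /// Mirrors the problematic ordering: rollback requests are queued, the
    /// first-incorrect marker is cleared, and only afterwards can the
    /// prediction-threshold check fail and discard the queued requests.
    fn advance_frame(&mut self) -> Result<Vec<Request>, SessionError> {
        let mut requests = Vec::new();

        if let Some(frame) = self.first_incorrect {
            // Queue the rollback: load the frame, then resimulate up to now.
            requests.push(Request::LoadGameState { frame });
            for _ in frame..self.current_frame {
                requests.push(Request::AdvanceFrame);
            }
            // Analogue of reset_prediction(): from here on, the correction
            // is remembered only by `requests`.
            self.first_incorrect = None;
        }

        // Simplified stand-in for the check made when adding local input.
        if self.current_frame - self.last_confirmed_frame >= self.max_prediction {
            // Early return: `requests` is dropped, and the correction with it.
            return Err(SessionError::PredictionThreshold);
        }

        requests.push(Request::AdvanceFrame);
        self.current_frame += 1;
        Ok(requests)
    }
}
```

In this model, a session with `first_incorrect: Some(500)` that hits the threshold returns the error, and the next successful call yields only an `AdvanceFrame` request with no `LoadGameState`, which is the missed correction.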

Additional Context

Below are example logs from a client that missed the correction. This was a 3-client game: one client missed the correction for player 1's input and the second applied it correctly, so players 0 and 2 are now desynced. (The frame listed in the input log is the predicted frame, in case that is confusing.) The logs might be more confusing than helpful, but they demonstrate that both clients were notified of player 1's input and ggrs pushed requests for corrections, yet the client that hit the prediction threshold did not roll back, causing the desync.

Player 0's logs (missed correction on player 1's input due to prediction threshold):

  • Frame 500 is confirmed without the rollback being applied. (There is no log of bones performing a rollback as in the second block, only the prediction threshold error.)
2024-04-10T02:42:31.164609Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 504 input: DensePlayerControl { .0: 992, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(1.0, 0.0)) }
2024-04-10T02:42:31.180776Z  WARN ggrs::input_queue: Setting first incorrect frame: 500
2024-04-10T02:42:31.180807Z  WARN ggrs::sessions::p2p_session: Requesting rollback to frame: 500
2024-04-10T02:42:31.180818Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 498
2024-04-10T02:42:31.180823Z  WARN ggrs::sync_layer: Prediction threshold error
2024-04-10T02:42:31.180829Z  WARN bones_framework::networking: Freezing game while waiting for network to catch-up.
2024-04-10T02:42:31.214065Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 500
2024-04-10T02:42:31.216185Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 505 input: DensePlayerControl { .0: 0, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(0.0, 0.0)) }

Player 2's logs (correction applied correctly on player 1's input):

  • Rollback in game is logged, and player 1's next input is confirmed.
2024-04-10T02:42:31.182435Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 503 input: DensePlayerControl { .0: 992, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(1.0, 0.0)) }
2024-04-10T02:42:31.189945Z  WARN ggrs::input_queue: Setting first incorrect frame: 500
2024-04-10T02:42:31.189965Z  WARN ggrs::sessions::p2p_session: Requesting rollback to frame: 500
2024-04-10T02:42:31.189974Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 499
2024-04-10T02:42:31.190980Z DEBUG bones_framework::networking: Loading (rollback) frame: 500
2024-04-10T02:42:31.190999Z DEBUG bones_framework::networking: Net player(1) local: false, status: Confirmed, frame: 504 input: DensePlayerControl { .0: 0, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(0.0, 0.0)) }

To Reproduce

I can reproduce this in jumpy pretty easily and can provide steps, but I will try to write a test that reproduces it, both for testing and for preventing regressions, once I figure out what to do here.

One thing that helps the repro is having 4 clients open at once (through a relay server, so not local p2p) with 200+ ping, which produces lots of prediction threshold errors :) In other words, poor network conditions definitely bring this to light; high ping combined with a lower prediction window might possibly do it too.

@MaxCWhitehead MaxCWhitehead added the bug Something isn't working label Apr 10, 2024
MaxCWhitehead commented

Haven't gotten to exploring what kind of fix might make the most sense, but I wrote a test that reproduces this.

I implemented a fake DebugSocket mechanism that lets the test control when messages are actually delivered between clients, which makes it possible to reproducibly enter a state in which a correction happens at the same time as a prediction threshold error.

The simplest form I have found to repro this is with 3 clients.

  • Client A changes its input, but B/C do not receive it yet.
  • Client C gets A's new input and triggers a rollback, but has not yet received B's input, so it also gets a prediction threshold error.
  • Because of the error, C does not get the requests for the rollback and misses the correction.
  • The test then ticks a couple more normal frames so that desync detection messages are exchanged, and fails due to the desync.

Here's the test for reference: main...MaxCWhitehead:ggrs:prediction-error-rollback-test
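
For readers who do not want to open the branch, the delivery-control idea can be sketched roughly as below. This is not the DebugSocket from the linked branch and does not implement any ggrs trait: `HeldSocket`, its methods, and the use of a plain `usize` as a sender id are all hypothetical.

```rust
use std::collections::VecDeque;

/// Hypothetical test socket that holds sent messages until the test
/// explicitly releases them. Not the DebugSocket from the linked branch.
struct HeldSocket<M> {
    /// (sender id, message) pairs that were sent but not yet delivered.
    held: VecDeque<(usize, M)>,
    /// Messages released by the test and waiting to be received.
    inbox: Vec<(usize, M)>,
}

impl<M> HeldSocket<M> {
    fn new() -> Self {
        HeldSocket {
            held: VecDeque::new(),
            inbox: Vec::new(),
        }
    }

    /// Record a message from `from`; it is not visible to the receiver yet.
    fn send(&mut self, from: usize, msg: M) {
        self.held.push_back((from, msg));
    }

    /// Deliver every held message from `from`, preserving send order, and
    /// return how many were released. Other senders' messages stay held.
    fn release_from(&mut self, from: usize) -> usize {
        let mut kept = VecDeque::new();
        let mut released = 0;
        while let Some((sender, msg)) = self.held.pop_front() {
            if sender == from {
                self.inbox.push((sender, msg));
                released += 1;
            } else {
                kept.push_back((sender, msg));
            }
        }
        self.held = kept;
        released
    }

    /// Drain everything that has been released so far.
    fn receive_all(&mut self) -> Vec<(usize, M)> {
        std::mem::take(&mut self.inbox)
    }
}
```

With something like this on client C's side, a test can release only A's messages while keeping B's held, which corresponds to the second step of the scenario: C learns about A's changed input but still lacks B's input.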

I will look into how to fix this and how to test against it.

gschup commented Jun 1, 2024

Hi! Thanks for posting this bug report, and sorry for not responding until now. I have just posted a PR that slightly alters the logic for handling inputs. I am not 100% sure it fixes this issue, but it might. Would you be able to test this again on the lockstep branch?
-> #79

gschup commented Jun 1, 2024

#70 has additional information on this issue.
