chore: add unit test of syncing large streams #3205

Closed
wants to merge 11 commits into from

stbrody
Contributor

@stbrody stbrody commented Apr 11, 2024

It's a little hard to tell exactly what the root cause of the failure here is, and whether or not it's actually IOD. This test passes consistently if the NUM_STREAMS and NUM_EVENTS_PER_STREAMS constants are set to low values like 10, but fails pretty regularly when they are set to 101. It also seems to pass more often when set to 100 than to 101. I'm theorizing that an odd (and prime) number makes the recon conversation more likely to return things out of order compared to an even number. So I think there are reasonable enough odds that this is failing due to IOD issues. I guess we'll have to wait until IOD ships and see if this test starts passing to be sure.

@stbrody stbrody self-assigned this Apr 11, 2024
@stbrody stbrody marked this pull request as ready for review April 12, 2024 14:47
@stbrody stbrody marked this pull request as draft April 12, 2024 14:50
@stbrody stbrody marked this pull request as ready for review July 15, 2024 19:58
Contributor

@dav1do dav1do left a comment

:shipit:

@stbrody
Contributor Author

stbrody commented Jul 16, 2024

I'm not really sure what changed, but this test seems to be passing in CI now.

@dav1do
Contributor

dav1do commented Jul 16, 2024

That is odd... I don't really know either. The changes to default to rust-ceramic did modify how some of the binary configuration/Docker stuff is set up, which may have helped, but it's odd. And the new releases we've done mean we should definitely be getting a version of c1 that has the fixes needed to support this (though that should have happened a while ago).

@stbrody
Contributor Author

stbrody commented Jul 17, 2024

hmm, have we tested this in CI since the big refactoring you did to how IOD works? Maybe that refactoring sped things up enough to make a difference?

@stbrody
Contributor Author

stbrody commented Jul 17, 2024

I figured out the issue: the test hasn't been running since we removed the CERAMIC_RECON_MODE env variable 🤦
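
For illustration, here is a minimal sketch of how a test gated on an env variable can end up silently skipped, and one way to make that failure loud. Only the CERAMIC_RECON_MODE name comes from this thread; the helper names, the accepted values, and the `requireInCi` flag are assumptions and not the actual js-ceramic test setup.

```typescript
// Hypothetical gating helpers; the real js-ceramic test setup may differ.
type Env = Record<string, string | undefined>

// Treat the variable as "on" whenever it is set to a non-empty value.
// (Assumption: the actual accepted values are not stated in this thread.)
function reconEnabled(env: Env): boolean {
  const mode = env['CERAMIC_RECON_MODE']
  return mode !== undefined && mode !== ''
}

// Decide what to do with the gated test. A plain 'skip' is silent in most
// test runners, so CI stays green even when nobody sets the variable any more.
// With requireInCi set, the missing variable becomes an error instead.
function gateDecision(env: Env, requireInCi: boolean): 'run' | 'skip' {
  if (reconEnabled(env)) return 'run'
  if (requireInCi) {
    throw new Error(
      'CERAMIC_RECON_MODE is unset, so the large-stream sync test would be skipped'
    )
  }
  return 'skip'
}
```

In a test file, the result of something like `gateDecision(process.env, true)` could pick between running and skipping the suite, so that a removed variable fails the CI run rather than quietly dropping coverage.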

@stbrody
Contributor Author

stbrody commented Jul 17, 2024

Okay, even though the test itself had a timeout of 10 minutes, there was a place within the test that only had a 30-second timeout, despite it basically requiring all the sync work to have already completed. Bumping that timeout up to 10 minutes as well has the test now passing in around seven and a half minutes in CI.
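
A rough sketch of the idea, not the code from this PR: the inner wait takes its timeout from the same budget as the test itself instead of a separate hard-coded 30 seconds. `SYNC_TIMEOUT_MS`, `remainingBudgetMs`, and `waitFor` are hypothetical names introduced here for illustration.

```typescript
// Single source of truth for how long the sync is allowed to take
// (10 minutes, matching the test-level timeout mentioned above).
const SYNC_TIMEOUT_MS = 10 * 60 * 1000

// How much of the overall budget is left at time `nowMs`, never negative.
function remainingBudgetMs(
  startedAtMs: number,
  nowMs: number,
  totalMs: number = SYNC_TIMEOUT_MS
): number {
  return Math.max(0, totalMs - (nowMs - startedAtMs))
}

// Hypothetical polling helper: resolves once `predicate` returns true and
// rejects once `timeoutMs` has elapsed without that happening.
async function waitFor(
  predicate: () => boolean | Promise<boolean>,
  timeoutMs: number = SYNC_TIMEOUT_MS,
  intervalMs: number = 1000
): Promise<void> {
  const deadline = Date.now() + timeoutMs
  for (;;) {
    if (await predicate()) return
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`)
    }
    // Sleep before polling again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
  }
}
```

Inside the test, the wait on sync completion could then be called with `remainingBudgetMs(startedAt, Date.now())` as its timeout, so the inner wait can never expire before the outer test timeout does.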

On the one hand, normally I would say a single test taking over 7 minutes is too long for a unit test and is going to slow down our development velocity too much. On the other hand, js-ceramic is on the way out so there shouldn't be too much development happening against it anyway, so maybe it's okay?

On the other other hand - this is now the best coverage we have of IOD, and it's only testing against the js-ceramic layer. So once js-ceramic goes away, we lose that coverage for C1, which I don't love. Is there a way we can do a test like this within C1 somehow?

@dav1do
Contributor

dav1do commented Jul 17, 2024

Ah, that makes sense. 7.5 minutes is a long time though. We have a similar test in rust-ceramic, but it's only a single-node test. It did fail consistently before the changes to fix IOD in c1 though.

@stbrody
Contributor Author

stbrody commented Jul 19, 2024

Closing this since the test is very slow and we have a different unit test of this functionality in the ceramic-one codebase, which is probably the better place to be testing this anyway.

@stbrody stbrody closed this Jul 19, 2024