-
-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: run all tests with a series of seeds to help find unlikely bugs faster #463
Comments
I realise that the two examples I pasted above were bugs in test code, and not in garble itself, but I still think this is important to ensure the software is stable. For instance, in #461 (comment) I ran into a bug that only triggered with one literal obfuscator, so I had to keep running the tests with different seeds to get lucky. I very nearly submitted a PR with the bug because I initially had not noticed it; having CI triple-check many seeds would have been nice. |
Speaking of which, |
We've had a handful of bugs in the past which only reproduced with a particular seed, such as 716f007 or d8e5351.
These can happen because we obfuscate code in different ways depending on the seed. For instance, a bug might only appear when one of the literal obfuscators is chosen on a particular string; if we have 8 such literal obfuscators, one single seed from a
go test
run has 12.5% chance of triggering the failure.CI does tend to catch these unlikely flakes at some point; each CI run spins up the tests on three platforms, and we also run jobs on master after merge, so each test gets at least 6 different test runs. Still, multiple flakes have gone unnoticed for weeks or months in the past.
To catch them faster, I propose that we add a
test-seeds
CI job that runs all the tests with a series of "default seeds". For example, it could be 32 seeds ranging from-seed=001122334400
to-seed=00112233441A
. We could teach the tests to inject such a default seed any time a test runsgarble build
without specifying a seed, and we could pass it around via an env var.This still wouldn't make flakes impossible, but if we test with 32 seeds rather than 1, then we're roughly 32 times more likely to find such flakes.
The number of seeds to test with is up for debate. 32 is probably a bit optimistic; the CI job could take a long time. Perhaps we could start with 10.
Something else we could do is use
go test -short
on those seeds; most integration tests do their basic testing as part of-short
, and do extra expensive tests such as double-checkinggo build
as part of the non-short build. Meaning that we'd basically already get all the benefit of many seeds with-short
, but testing with each seed would be significantly faster.The text was updated successfully, but these errors were encountered: