[python] Add hypothesis tests #3560
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage diff against main (#3560):

|          | main   | #3560  | +/-    |
|----------|--------|--------|--------|
| Coverage | 86.25% | 86.31% | +0.06% |
| Files    | 55     | 55     |        |
| Lines    | 6381   | 6381   |        |
| Hits     | 5504   | 5508   | +4     |
| Misses   | 877    | 873    | -4     |

Flags with carried forward coverage won't be shown.
This is very cool, thanks @bkmartinjr. I've proposed #3663 on top of this, with some suggested changes, all pretty minor. The only thing I want to flag is the added nondeterminism in CI runs (which I know is a common convention with frameworks like this). The current setup is, roughly:
Would it be better to seed part 2, so that CI runs the same "thousand" cases every time (which are known to work)? I'm guessing the answer is "no," but wanted to mention this tradeoff and make sure we're on the same page. If [ongoing fuzzing of future, unrelated PRs] might turn up novel bugs, then we should just 10x or 100x part 1 above, to catch them now (is the argument).
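(For concreteness, a minimal sketch of what seeding those CI cases could look like using Hypothesis's own knobs; the test and strategy below are illustrative only, not taken from this PR.)

```python
from hypothesis import given, seed, settings, strategies as st


# Option A: derive the examples deterministically from the test itself,
# so every CI run replays the same cases.
@settings(derandomize=True, max_examples=1000)
@given(value=st.integers())
def test_roundtrip_derandomized(value: int) -> None:
    assert int(str(value)) == value


# Option B: pin an explicit seed, which can be rotated deliberately.
@seed(20250101)
@given(value=st.integers())
def test_roundtrip_seeded(value: int) -> None:
    assert int(str(value)) == value
```

Either knob trades bug-finding breadth for repeatability, which is exactly the tradeoff raised above.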
We are on the same page - the whole point of this type of testing is to mix it up and test new stuff. Because you can't afford exhaustive testing in CI, you test a bit (randomly) and rely on logging to tell you how to recreate any issue (which works most of the time, though not all of the time). I advocate keeping it this way: over the long haul, you end up running far more unique tests. Info on how to recreate failures is here.

The other alternative we could consider is manually running these tests on a regular basis, but history has shown that this will fail, as there is a human in the loop. Long term, I think the best approach is CI coupled with self-hosted GHA runners. With that strategy, you can run an "expensive" test on a schedule (say, every Sunday night) and do a much, much more exhaustive search.

Late edit: this may be worth raising with the whole team, as everybody is impacted by stochastic tests.
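As a rough sketch of that CI-plus-scheduled-exhaustive setup (the profile names, example counts, and `conftest.py` placement here are assumptions, not this PR's actual configuration), Hypothesis settings profiles let one suite run a small random sample on every push and a far larger search on a schedule:

```python
# conftest.py (illustrative sketch, not this PR's configuration)
import os

from hypothesis import settings

# Small randomized sample for every CI run.
settings.register_profile("ci", max_examples=1_000)

# Much larger search for a scheduled job (e.g. Sunday nights on a self-hosted runner).
settings.register_profile("exhaustive", max_examples=100_000, deadline=None)

# The workflow selects the profile via an environment variable.
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "ci"))
```

When a randomized run does fail, Hypothesis reports the falsifying example, and with `print_blob=True` it also prints a `@reproduce_failure(...)` decorator that replays the exact case locally - the "logging tells you how to recreate the issue" mechanism mentioned above.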
@ryan-williams - also, a commit to main in the past few days now causes the tests to fail. I'll need to debug that before landing this PR (it may be a test bug, or a bug bug - won't know until I dig into it).

Update: root cause is a regression on main, filed as sc-62887.
🚢
Thanks @bkmartinjr !!
* factor README link
* use `mode: OpenMode` instead of `mode: str`
* add missing `self` params
* OpenMode
* rm unused `SOMAArrayStateMachine._reopen`
* `s/work-arounds/workarounds/`
* add/tweak type annotations
* `get_entries`: return `set[str]`
* parameterize `Ledger` with a `LedgerEntryType` (see the sketch below); this allows the type system to understand that e.g. the return value of a `.read()` can be a `PyDictLedgerEntry`, which can then have `to_dict()` invoked
* rm unused `concurrency` fixture
* rm unused imports
* avoid `st` shadowing
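A minimal sketch of the generic-`Ledger` pattern mentioned in the list above; the class bodies are illustrative placeholders, not the PR's actual implementation:

```python
from typing import Generic, TypeVar


class LedgerEntry:
    """Base class for entries recorded in the test ledger (illustrative)."""


class PyDictLedgerEntry(LedgerEntry):
    def __init__(self, data: dict) -> None:
        self._data = data

    def to_dict(self) -> dict:
        return dict(self._data)


LedgerEntryType = TypeVar("LedgerEntryType", bound=LedgerEntry)


class Ledger(Generic[LedgerEntryType]):
    """Generic over its entry type, so `.read()` is typed precisely."""

    def __init__(self) -> None:
        self._entries: list[LedgerEntryType] = []

    def write(self, entry: LedgerEntryType) -> None:
        self._entries.append(entry)

    def read(self) -> LedgerEntryType:
        return self._entries[-1]


# With Ledger[PyDictLedgerEntry], a type checker knows read() returns a
# PyDictLedgerEntry, so calling .to_dict() needs no cast.
ledger: Ledger[PyDictLedgerEntry] = Ledger()
ledger.write(PyDictLedgerEntry({"soma_joinid": 0}))
print(ledger.read().to_dict())
```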
Add an initial set of tests based upon the Hypothesis property-based testing framework. Included are:
See the README included in this PR for some other important information regarding these new tests.
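For readers new to the framework, the rough shape of a Hypothesis stateful test looks like the toy sketch below; the class and rules are illustrative only, not the `SOMAArrayStateMachine` added in this PR:

```python
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule


class ListModelStateMachine(RuleBasedStateMachine):
    """Toy example: drive a real list alongside a simple model and compare them."""

    def __init__(self) -> None:
        super().__init__()
        self.real: list[int] = []
        self.model_len = 0

    @rule(value=st.integers())
    def append(self, value: int) -> None:
        self.real.append(value)
        self.model_len += 1

    @rule()
    def clear(self) -> None:
        self.real.clear()
        self.model_len = 0

    @invariant()
    def lengths_agree(self) -> None:
        assert len(self.real) == self.model_len


# Hypothesis generates random sequences of rules and checks every invariant,
# shrinking any failing sequence to a minimal reproduction.
TestListModel = ListModelStateMachine.TestCase
```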