Generated test suite with cover option contains failing tests #188

azewiusz · 2022-11-17T22:40:09Z

Expected vs actual behavior

For program logic of a certain complexity, the cover option does generate unit tests (it may take more time, but it generates them). Once these tests are run, I get assertion failures, as if the computed execution-path parameters did not actually lead to the assertions passing.

To Reproduce
I created a git repo, https://github.com/azewiusz/for-crosshair, where I describe how to reproduce this problem.
I'm using version 0.0.32 of crosshair-tool.

pschanely · 2022-11-21T13:14:40Z

Thank you for this issue and detailed repro code!
I am having a little trouble getting your results immediately on my Mac; it looks like you're on Windows, correct? And which version of Python are you using?

I could imagine some challenges using floats in particular with the cover command, but I want to make sure I understand your issue exactly before I get into those details. :)

azewiusz · 2022-11-21T13:46:29Z

Yes, I'm using cover on a Windows machine. It may be that Python was updated to 3.9.10 on the test workstation; it was likely 3.7 before (at the time when I reproduced this issue). Also, after updating to the latest version of crosshair-tool, 0.0.34, this error is gone. So it was definitely reproducible, many times, but only on 0.0.32.

pschanely · 2022-11-21T20:45:51Z

Hmm, ok, my attempt at Windows + Python 3.7 + CrossHair 0.0.32 produced a successfully running test case too. That said, I know that over the last few months we've fixed a handful of issues with cover and diffbehavior, so I'm still inclined to chalk it up to one of those.

Now, real talk, one big gotcha with CrossHair and floats: for performance reasons (*), CrossHair approximates floating point behavior using true (arbitrary precision) real numbers. Therefore, it's possible to get coverage cases that sit on rounding boundaries and fail to re-execute the expected path.

(*) Z3 is technically capable of doing floating-point-accurate symbolic execution. However, those capabilities are very slow; it might take O(minutes) to reason about a single floating point operation. In an ideal world, we might try both approaches, but I haven't invested much into this idea, as I haven't seen the problem come up too often in practice. (but that's also why it's so important that people file bugs when things don't work, like you have! Thank you!)
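
To make the rounding-boundary gotcha concrete, here is a minimal sketch. It does not use CrossHair or Z3; it only stands in for the real-number model with Python's exact `fractions.Fraction`, and the function `hits_branch` is a hypothetical example, not code from the repro repository. Inputs that take a branch under exact arithmetic can miss it when re-executed with actual floats:

```python
from fractions import Fraction


def hits_branch(x, y, target):
    """Hypothetical code under test: a branch guarded by an equality check."""
    return x + y == target


# Exact rational arithmetic (a stand-in for reasoning over real numbers):
# 1/10 + 2/10 is exactly 3/10, so the branch is considered reachable.
real_result = hits_branch(Fraction(1, 10), Fraction(2, 10), Fraction(3, 10))

# Re-executing "the same" inputs as IEEE 754 doubles:
# 0.1 + 0.2 rounds to 0.30000000000000004, so the branch is not taken.
float_result = hits_branch(0.1, 0.2, 0.3)

print(real_result)   # True
print(float_result)  # False
```

A generated test that asserts the outcome seen in the exact model would fail in this situation when run with real floats, which is the kind of mismatch described above.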

azewiusz · 2022-11-21T20:58:41Z

I spent some time on this today (I tried rolling back from 0.0.34 to 0.0.32), but I failed to reproduce it. I may have lost the exact setup during experimentation.
I think what you write is important for the case that I was trying to work with (the floating point accuracy).
Thank you for your investigation.

pschanely · 2022-12-14T19:45:50Z

> I think what you write is important for the case that I was trying to work with (the floating point accuracy).

Ah, then I will count this as a vote in favor of working on a feature that also attempts fully accurate floating-point alongside the implementation based on Real numbers!

pschanely mentioned this issue Nov 16, 2023

Implement true floating point semantics #230

Open
