Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to fix non-deterministic tests #160

Merged
merged 4 commits into from
Jan 31, 2025

Conversation

Pennycook
Copy link
Contributor

@Pennycook Pennycook commented Jan 24, 2025

Related issues

Proposed changes

  • Fix a weird-looking assignment to self.tmp inside of the cbi-cov tests.
  • Remove ResourceWarning filters from all tests, so we can't ignore them.
  • Add explicit calls to cleanup() when we're finished with a TemporaryDirectory.
  • Use setUpClass and tearDownClass in cases where all tests can share the same resources.
  • Use assertCountEqual instead of assertEqual when comparing coverage.json files.

Even the old tests worked deterministically on my laptop, so I have no idea if this will actually fix anything. But the changes make sense to me, so 🤞.

EDIT: I suspect the assertCountEqual change here is the real fix, but I think we should merge the TemporaryDirectory changes as well. Even if they weren't the cause of the non-determinism in #159, the new versions of the tests are cleaner.

There were two unusual things about this test:
- We assigned a local variable to "self.tmp"; and
- We ignored a ResourceWarning.

One or both of these things could be the source of the non-deterministic
behavior we've been seeing.

Signed-off-by: John Pennycook <[email protected]>
Based on our experience with the cbi-cov tests, these could be hiding bugs.

Signed-off-by: John Pennycook <[email protected]>
Multiple tests were issuing ResourceWarnings due to relying on implicit cleanup
of a TemporaryDirectory.

I've rewritten these tests to implicitly call cleanup(), and in cases where the
TemporaryDirectory can be shared by all tests in the class I've switched over to
setUpClass and tearDownClass.

Even though we haven't seen any of these tests fail, it's possible that they
might share the same issues as the cbi-cov tests.

Signed-off-by: John Pennycook <[email protected]>
@Pennycook Pennycook added the bug Something isn't working label Jan 24, 2025
@Pennycook Pennycook requested a review from laserkelvin January 24, 2025 12:03
@Pennycook
Copy link
Contributor Author

...they're still failing, so something else is going on here.

Since the ordering of files on the file system is not guaranteed, the order of
the files in a coverage report is not guaranteed either.

Signed-off-by: John Pennycook <[email protected]>
@Pennycook
Copy link
Contributor Author

@laserkelvin - Since this started working after the assertCountEqual fix, I think it's ready to review now.

@Pennycook Pennycook linked an issue Jan 29, 2025 that may be closed by this pull request
Copy link
Contributor

@laserkelvin laserkelvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the purpose of fixing CI, it seems to be working but I've made a note about refactoring to use context managers for the temporary directories to ensure that it's actually cleaned up within a defined scope, as opposed to relying on tearDown methods.

I think that would improve maintainability, although I'm not 100% sure it would be functionally different to now.

@@ -66,8 +64,8 @@ def test_compute(self):
"""Check that coverage is computed correctly."""
sys.stdout = io.StringIO()
# Create a temporary codebase to work on.
self.tmp = tempfile.TemporaryDirectory()
p = Path(self.tmp.name)
tmp = tempfile.TemporaryDirectory()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my comment - I don't know if you can be bothered to refactor it - would be to use TemporaryDirectory in a context so it does clean up after itself.

I guess you call it manually, but I don't know if there are other missed behaviors omitted if you don't rely on the context's __exit__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I did try using the context manager and honestly I couldn't really wrap my head around how it was supposed to work.

If you just want a temporary file to dump stuff in and then it goes away, the context manager approach is straightforward. But if you want to write to a file and then read from it later, you end up with this weird nesting of context managers and have to use a bunch of non-standard options. Here's an example from the documentation:

with tempfile.NamedTemporaryFile(delete_on_close=False) as fp:
    fp.write(b'Hello world!')
    fp.close()
# the file is closed, but not removed
# open the file again by using its name
    with open(fp.name, mode='rb') as f:
        f.read()

I think we'd also have to rewrite things to create and destroy the temporary files for every test, rather than re-using the same temporary files across all tests... That probably isn't that important, but it might make the tests run a little longer.

@@ -32,6 +31,11 @@ def setUp(self):
open(p2 / "quux.h", mode="w").close()
open(p2 / "README.md", mode="w").close()

@classmethod
def tearDownClass(self):
self.tmp1.cleanup()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this kind of thing is what makes it a bit clunky to me - I know that tearDown methods are meant to be called under all circumstances, pass or fail, but it makes it hard to validate behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, and I'm not opposed to finding a better way to rewrite these tests. But I'd like to defer it until a future PR, since you're not opposed -- I want to make sure that we actually did fix the issue, before spending additional time refactoring and/or rewriting tests.

@Pennycook Pennycook merged commit 8ffa6db into intel:main Jan 31, 2025
3 checks passed
@Pennycook Pennycook deleted the deterministic-tests branch January 31, 2025 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants