
Create script for automated regression testing #13

Open
wants to merge 16 commits into develop

Conversation

@ilectra ilectra (Collaborator) commented Nov 27, 2024

Fixes #8.
This is still WIP; I need to

  • check that all files are copied properly to the build directory and that paths are correct
  • possibly convert to pytest

But you can already use it from inside the build directory where the test sits, after copying over the file with the reference values.
@edbennett are the values we're checking the right ones? Or do we need the plaquette value to a gazillion decimals?

@ilectra ilectra requested review from edbennett and qiUip November 27, 2024 12:13
@ilectra ilectra self-assigned this Nov 27, 2024
@qiUip qiUip (Collaborator) left a comment


In terms of functionality, this works great! A few suggestions in the review.

I also noticed that the results change based on compilation options: e.g., when compiling for sp2n (--enable-Sp --enable-Nc=4) and running Test_hmc_IwasakiGauge, I got different results than for the same test compiled for su3 (--enable-Nc=3). I'm not sure these values will make much sense without strictly fixed compile and run parameters.

@edbennett (Collaborator)

Something I should have considered earlier: since this uses a Metropolis algorithm, checking the final value of the plaquette is only valid if the version being compared against accepts the change. (In the case of a rejected change, the plaquette output is the input value, and it only depends on the code we are touching if the code changes in such a way that the change is accepted instead, which is unlikely.)

Provided we consistently stick with the same test case (and have verified that it accepts), then I think this is fine, but we should be careful about generalising this.
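For illustration, a minimal sketch of how the script could guard against this, assuming Grid-style log lines for the Metropolis decision and the plaquette (the exact log patterns and the helper names here are assumptions, not the actual script):

```python
import re

def trajectory_accepted(log_text: str) -> bool:
    # Assumption: the HMC log states the Metropolis decision with an
    # "ACCEPTED"/"REJECTED" token somewhere in the output.
    return bool(re.search(r"\bACCEPTED\b", log_text))

def parse_final_plaquette(log_text: str) -> float:
    # Assumption: the log reports lines like "Plaquette: <value>"; the last
    # occurrence is taken as the final value.
    return float(re.findall(r"Plaquette:\s*([0-9.eE+-]+)", log_text)[-1])

def check_final_plaquette(log_text: str, expected: float, tol: float = 1e-9) -> None:
    # If the trajectory was rejected, the final plaquette is just the input
    # value and tells us nothing about the code under test.
    assert trajectory_accepted(log_text), "trajectory rejected; plaquette not comparable"
    assert abs(parse_final_plaquette(log_text) - expected) < tol
```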

@ilectra (Collaborator, Author) commented Dec 20, 2024

  • To do: add more parameters/references to check for, e.g. nthreads, CPU vs GPU.
  • Read references from the command line (if possible)? I don't think this is a good idea; it should be easy enough to add a line to the ...._expected.txt file instead (see the sketch after this list).
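For what it's worth, a minimal sketch of how extra parameters could be keyed in such a file, assuming one whitespace-separated record per line (the field layout is an assumption, not the actual schema of the ...._expected.txt files):

```python
from pathlib import Path

def load_expected(path: str) -> dict:
    # Hypothetical layout: a key (e.g. observable name, nthreads, device)
    # followed by the reference values, whitespace-separated, one record per
    # line -- so adding a parameter means adding a field or a line.
    refs = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            refs[tuple(fields[:3])] = fields[3:]
    return refs
```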

Some improvements in error checking and reporting.
Future test reference-value files added; they will have to be linked over explicitly here as well, since autotools does not do wildcards.
@ilectra (Collaborator, Author) commented Jan 30, 2025

To tell whether a run used the GPU or the CPU, look for nvidia::memalloc or unified memory in the output.
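A one-function sketch of that heuristic, assuming those strings appear verbatim in the run output:

```python
import re

def ran_on_gpu(log_text: str) -> bool:
    # Heuristic from the comment above: GPU runs mention NVIDIA memory
    # allocation or unified memory in their output.
    return bool(re.search(r"nvidia::memalloc|unified\s+memory", log_text, re.IGNORECASE))
```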

@ilectra (Collaborator, Author) commented Jan 31, 2025

I think this is ready for a final review. Please have a look and add any test-case values that you find useful to the Test_hmc_Sp_WilsonFundFermionGauge_expected.txt file. @qiUip @edbennett @asifsamiarain

@@ -0,0 +1 @@
8.8.8.8 1.1.1.1 0.269298793 633bf471 3a22ad20
@ilectra (Collaborator, Author) commented on the diff

Those two expected-values files have to be adjusted for the extra test parameters and line numbers. @qiUip

@ilectra (Collaborator, Author) commented Feb 7, 2025

Add usage note to script

@edbennett (Collaborator)

As you noted, a note on usage would be useful. You might want to run ruff format or similar to tidy up some very minor things. Other than that, looks good to me!

Add some more test parameters and expected values - in progress...
Modify saveInterval in hmc tests to produce outputs.
Fix number of threads to 1 if not specified.
@ilectra (Collaborator, Author) commented Mar 3, 2025

The lat.checkpoint file checksum is machine-dependent, and therefore not appropriate for regression tests (unless we also keep note of the machine where the reference values were generated). A better reference value is LINK_TRACE, found on line 9 of the lat.checkpoint header.
Do we still want to keep the checksum for more detailed comparisons? In that case, the user would have to generate the checksum before the code changes they want to check for exact bit-wise identity, and then compare it with the one after. A command-line option for pytest, e.g. --checksum, could take no argument to just report the checksum after the run, or take a value to compare with as an argument. Either way, it'll be a two-step process: the user runs pytest with the old version of the code to generate the checksum value, then re-compiles with the new version and runs pytest again to do the comparison. Does this seem worth the effort, @edbennett?
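For illustration, a minimal conftest.py sketch of how such an option could look, using pytest's standard pytest_addoption hook; the semantics follow the description above, but the sentinel handling, the helper name read_link_trace, and the "KEY = value" header layout are assumptions:

```python
# conftest.py -- hypothetical sketch, not the actual test script.
import pytest

REPORT_ONLY = "report-only"  # sentinel: bare --checksum means "report, don't compare"

def pytest_addoption(parser):
    parser.addoption(
        "--checksum",
        nargs="?",            # optional value: bare flag vs. flag with a reference
        const=REPORT_ONLY,
        default=None,
        help="Report the lat.checkpoint checksum after the run, or compare "
             "it against the given reference value.",
    )

@pytest.fixture
def checksum_option(request):
    return request.config.getoption("--checksum")

def read_link_trace(header_lines):
    # Per the comment above, LINK_TRACE sits on line 9 of the lat.checkpoint
    # header; the "KEY = value" layout is an assumption about the format.
    key, _, value = header_lines[8].partition("=")
    assert key.strip() == "LINK_TRACE"
    return float(value)
```

A test could then skip when the option is absent, print the checksum when the flag is bare, and assert equality when a reference value was supplied, which keeps the two-step workflow described above.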

@edbennett (Collaborator)

Having a tool that makes it easier to do a checksum comparison before and after a change, so you can verify that a trajectory is bitwise compatible when run on the same machine, would be useful. Whether it's worth the effort of course depends on how much effort it would be. I'd say it's worth a single-digit number of hours of effort, but not that much more than that.

Successfully merging this pull request may close: Make automated test from example (#8).