
Form factor calculations are not numerically stable #213

Open
comcon1 opened this issue Feb 11, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@comcon1
Member

comcon1 commented Feb 11, 2025

You can see the discrepancy if you install numpy-2.x and then run the unit tests:

pytest -vs test_analyze.py::test_analyze_ff

When the calculated JSON is compared to the pre-calculated one, the difference is quite big, sometimes more than 1%. It probably has almost no scientific impact, but it is suspicious and I would not leave the code as it is. For comparison: the error in OP values is less than 1e-6. Right now this makes the unit tests fail with numpy-2.x, and I would say that failure is correct. It could also trigger failures in other build environments, but I have not seen that yet.
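
To make the scale of the comparison concrete, here is a minimal sketch of the kind of tolerance check involved. The file names and the flat-array layout are illustrative only, not the actual structure of the test data:

```python
import json
import numpy as np

# Illustrative file names: the real test compares freshly computed form-factor
# JSON against a pre-calculated reference shipped with the test suite.
with open("FormFactor_computed.json") as f:
    computed = np.asarray(json.load(f), dtype=float)
with open("FormFactor_reference.json") as f:
    reference = np.asarray(json.load(f), dtype=float)

# Per-point relative discrepancy: for form factors it can exceed 1e-2 under
# numpy-2.x, while order parameters stay below 1e-6.
rel_err = np.abs(computed - reference) / np.maximum(np.abs(reference), 1e-12)
print("max relative error:", rel_err.max())
```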

@comcon1 comcon1 added the enhancement New feature or request label Feb 11, 2025
@comcon1 comcon1 added this to the Release for software paper milestone Feb 11, 2025
@batukav
Collaborator

batukav commented Feb 14, 2025

Please see my comment here

Do we know which numpy version was used to create the test files? If so, we can check step by step for the changes that might have caused this discrepancy.

Numpy-2.x is quite a new release, and I don't know if we have ever tested against it. So it should not be used for computing Databank properties. Am I missing something?

@comcon1
Member Author

comcon1 commented Feb 14, 2025

The test files were generated with numpy-1.x. However, I don't believe they are correct. If the numbers depend on the numpy version, that clearly indicates the algorithm is numerically unstable, so I would say we don't have correct reference values. For the OP values there is no discrepancy, so this is not a numpy problem; it is more likely a problem with the algorithm, and in that case the current Databank data is affected by this bug throughout. The discrepancy isn't large, so fixing it shouldn't have huge scientific importance, but I wouldn't leave it as it is for long.
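
To illustrate the reasoning with a generic numerical example (not the Databank code): an ill-conditioned step such as a near-cancelling subtraction turns rounding-level differences, of the kind that can change between numpy versions, into percent-level changes in the final number:

```python
# Generic illustration, not the Databank code: a near-cancelling subtraction
# amplifies a one-part-in-1e9 perturbation into a ~1% change in the result.
a = 1.0000001e6
b = 1.0000000e6

diff = a - b                         # ~0.1
diff_perturbed = a * (1 + 1e-9) - b  # same inputs up to a 1e-9 perturbation

print(abs(diff_perturbed - diff) / abs(diff))  # ~1e-2, i.e. about 1%
```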

I actually remember that @fsuarezleston looked into this code in summer 2024 while preparing to write his analogue for monolayers; he said it was very messy and decided not to use it in his monolayer code at all. So I'm not very surprised that this code now appears to be numerically unstable. I'm not 100% sure about it, but many things now point in that direction.

Regarding what to do, I'm a bit unsure. It would probably be easier to completely rewrite the code and then check that the numpy version doesn't affect the final numbers; that may well be simpler than debugging the old code. But it's really up to whoever takes it on, and I don't have time for it right now. Alternatively, you can do as you wrote: clone the repo, run a small test job in two different folders under the two environments with verbose output in two parallel tabs, and find where the results start to differ. That's also a way.
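
If someone does take on a rewrite, a possible starting point is sketched below, using the textbook bilayer form factor |F(q)| = |∫ Δρ(z) e^{iqz} dz| with everything accumulated in float64 and explicit trapezoidal weights instead of np.trapz (renamed to np.trapezoid in numpy 2.x). The function signature and units are assumptions for illustration, not the Databank's current interface:

```python
import numpy as np

def form_factor(z, delta_rho, q_values):
    """|F(q)| of a bilayer from a density-difference profile.

    z          -- bin centres along the membrane normal (nm), 1D array
    delta_rho  -- density relative to bulk solvent at each z, 1D array
    q_values   -- scattering vector magnitudes (1/nm), 1D array
    """
    z = np.asarray(z, dtype=np.float64)
    rho = np.asarray(delta_rho, dtype=np.float64)
    q = np.asarray(q_values, dtype=np.float64)[:, np.newaxis]

    # Explicit trapezoidal weights; accumulating everything in float64 keeps
    # version-to-version rounding differences at machine-precision level.
    w = np.empty_like(z)
    w[1:-1] = 0.5 * (z[2:] - z[:-2])
    w[0] = 0.5 * (z[1] - z[0])
    w[-1] = 0.5 * (z[-1] - z[-2])

    real = np.sum(w * rho * np.cos(q * z), axis=1)
    imag = np.sum(w * rho * np.sin(q * z), axis=1)
    return np.sqrt(real**2 + imag**2)
```

Running an implementation of this shape under both numpy-1.x and numpy-2.x and comparing the outputs would then be a direct check that the old algorithm, not numpy itself, is responsible for the discrepancy.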
