Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve DISF+DCSF summation #619

Open
wants to merge 36 commits into
base: protos
Choose a base branch
from

Conversation

MBartkowiakSTFC
Copy link
Collaborator

Description of work
Neutron total dynamic structure factor analysis does not test if the input datasets (DCSF and DISF) were produced with the correct weights applied. As a result, the datasets frequently end up having the weights applied twice.

Fixes

  • Re-enabled the NDTSF analysis
  • Check what weights were used to produce the input datasets. If "equal" apply weights, if "b_coherent" and "b_incoherent2" don't apply weights, reject other inputs.

To test
Try reproducing a known result of dcsf/disf calculation and add the results to each other using the neutron total dynamic structure factor analysis.

@ChiCheng45
Copy link
Collaborator

I found some inconsistency when different equal or b_inc2/b_coh weighs are used. Here are some examples with the Ar36 trajectory.

DCSF (equal) / DISF (equal)
image

DCSF (b_coh) / DISF (b_inc2)
image

DCSF (equal) / DISF (b_inc2)
image

DCSF (b_coh) / DISF (equal)
image

@gonzalezma
Copy link
Collaborator

I am not sure how this is intended to work. Originally, the NTDSF analysis was introduced to compute a signal as similar as possible to the measured one, i.e. something similar to Sum_ij b_ib_jsqrt(c_ic_j)Scoh_ij + Sum_i b_ic_iSinc_i, with output units in barn/sterad (or fm^2/sterad). By providing also the different contributions, it was then possible to determine the relative importance of the coherent and incoherent contributions, etc. As the DCSF and DISF are normalized, one needs to be careful when summing their results to give the NTDSF. I agree that it is possible, but my advice would be to use the partials of both analysis (so it does not matter the weighting scheme used in their calculation) and recalculate all the needed sums.

As for the issue with the Argon simulation, if the element used in the analysis is Ar36, its b_incoherent is null, so I wonder if this could not affect the tests that you are trying to do.

@MBartkowiakSTFC
Copy link
Collaborator Author

@gonzalezma Let's start the discussion of this PR with this paper https://journals.aps.org/pra/abstract/10.1103/PhysRevA.6.1107 where the coherent and incoherent contributions to the total S(q,w) are described for a mixture of gases (but mainly argon). The S_inc shows the strongest signal at 10 nm^-1 and the S_coh at 20 nm^-1, the respective intensities given as 2.6 and 2.4 meV^-1. So, the coherent and incoherent contributions to S(q,w) are expected to be of comparable intensity.
Using the same scattering lengths as those in the paper, and a trajectory of argon, I can calculate the DCSF and DISF in MDANSE. The Q dependence of the signal at 0 energy transfer is shown here:
Sqw_vs_Q_separate
The maxima of the two curves are still comparable.

If I use the existing main branch to combine the DCSF and DISF, the result looks like this:
Sqw_vs_Q_protos_merged
The scale of the signal is different, and the incoherent contribution is gone.

The code proposed in this PR takes the same DCSF and DISF, and produces the following result:
Sqw_vs_Q_newbranch_merged

It appears to me that the result produced by the main branch is not correct. While it is possible that the calculation implemented in this PR is not optimal, it still seems like an improvement over the previous version. What is your opinion? What result would you expect from the NDTSF analysis in this situation?

@MBartkowiakSTFC
Copy link
Collaborator Author

@ChiCheng45 I was getting a similar result at some point, but then I noticed that I was using an artificial element and the value of 'equal' in the table was set to 0 for this one. All the other elements have 'equal' set to 1. I think we should have 'equal' always evaluate as 1, and not take it from the table.

Here is my comparison of the b_coh/b_incoh2 case and the equal/equal case. The results are still different, but this could also be due to the different q vectors, insufficient trajectory length, system size, etc. Blue curve has DCSF and DISF calculated using 'equal' weights and then summed up. Orange used b_coh, b_incoh2 for DCSF and DISF.
ndtsf_from_equal

@gonzalezma
Copy link
Collaborator

I don't know what it is implemented in the main branch, so I would need to look at the code. In MDANSE 1.5 this was an analysis on its own, while now I think that it is a "secondary" analysis that uses as input the output of the DISF and DCSF analysis that should be run before. Correct me if I am wrong.

If this is the case, I don't have anything against this approach, but I don't understand the use of the weighting. My comment was to indicate that it should not matter which weighting is used in calculating DISF and DCSF, if one uses the partials and not the total DISF and total DCSF.

As for your example, my answer to what result I would expect for the NTDSF is "it depends". E.g. if I assume that the sample is Ar36, the incoherent cross section is null, so the old result where NTDSF = DCSF, with no incoherent contribution seems right. But if you assume that you have natural Ar, then the incoherent contribution should be visible. In fact, I don't think that a monoatomic system such as Ar is the best one to test this. I would suggest using water and testing the total S(Q,w) and their corresponding coh/inc contributions for H2O and D2O.

Copy link
Collaborator

@ChiCheng45 ChiCheng45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested with the latest version and as far as I can tell the results are consistent now when you use equal or scattering length weights or a mixture of the two.
image

@ChiCheng45
Copy link
Collaborator

Just wondering was self._outputData["f(q,t)_coh_%s%s" % pair] and self._outputData["f(q,t)_inc_%s" % element] normalized by 1/np.sqrt(ni * nj) and 1/ni before this part of the code?

bi = ATOMS_DATABASE.get_atom_property(pair[0], "b_coherent")
bj = ATOMS_DATABASE.get_atom_property(pair[1], "b_coherent")
ni = nAtomsPerElement[pair[0]]
nj = nAtomsPerElement[pair[1]]
ci = ni / nTotalAtoms
cj = nj / nTotalAtoms
self._outputData["f(q,t)_coh_weighted_%s%s" % pair][:] = (
self._outputData["f(q,t)_coh_%s%s" % pair][:]
* np.sqrt(ci * cj)
* bi
* bj
)

bi = ATOMS_DATABASE.get_atom_property(element, "b_incoherent2")
ni = nAtomsPerElement[element]
ci = ni / nTotalAtoms
self._outputData["f(q,t)_inc_weighted_%s" % element][:] = (
self._outputData["f(q,t)_inc_%s" % element][:] * ci * bi
)

@MBartkowiakSTFC
Copy link
Collaborator Author

Just wondering was self._outputData["f(q,t)_coh_%s%s" % pair] and self._outputData["f(q,t)_inc_%s" % element] normalized by 1/np.sqrt(ni * nj) and 1/ni before this part of the code?

In the protos branch I'm pretty sure this normalisation would be applied twice. In this PR I do not apply the normalisation to the partials. Instead, the scaling factor is calculated and stored as a dataset attribute, so the plotter can apply the scaling if needed, but the data array itself is written to the file unchanged. So, the partials read by the NDTSF job are taken not normalised, and then the normalisation by 1/np.sqrt(ni * nj) and 1/ni is applied.

@MBartkowiakSTFC MBartkowiakSTFC marked this pull request as draft January 31, 2025 12:07
@ChiCheng45
Copy link
Collaborator

ChiCheng45 commented Feb 4, 2025

I think a test to check that this is working would be to generate a liquid argon trajectory and use the Ar scattering lengths to calculate the neutron total dynamic structure factor. Using the same trajectory assign the atoms to Ar36, Ar38, and Ar40 randomly in their natural abundances. There will be no incoherent scattering but this should get absorbed into the coherent part. If the neutron total dynamic structure factor from the first calculation is equal to the coherent scattering in the 2nd then this PR should be correct (we might need a big system for this to work).

Here is my reasoning for why I think this. The intermediate scattering functions are

image

image

In the second calculation, we will specify the isotopes so we use the scattering lengths for Ar36, Ar38, and Ar40. To be precise the scattering functions are now

image

image

The incoherent part disappears. However, what we have done is absorbed the incoherent part into the coherent as the coherent part contains the diagonal part as well as the off-diagonal.

image

If this system is large and the isotopes are placed randomly then I should be able to find situations where the value of the correlation functions in the summation are approx the same at least for that specific choice of q and t.

I can now group the terms up so that I can write something like this.

image

If I change the summation over $K \neq L$ to include the diagonal part I must remove it again.

image

which is the sum of the coherent and incoherent parts at the top. Does this make sense?

@ChiCheng45
Copy link
Collaborator

ChiCheng45 commented Feb 13, 2025

OK, I understand now what you mean. But I am not sure that things "automatically" work, as you seem to suggest. First of all, when you have spin incoherence (e.g. with H or D), I do not think that your reasoning applies. And in this case (Ar), I think that it may be correct, but there is something that I do not understand in your signals. If they correspond to the total scattering (i.e. not normalized), the area under the 36Ar S(Q,w) should be about 50 times larger than the area under the 38Ar S(Q,w), as this is the ratio between their total scattering cross sections (77.9/1.5). And then, from the numbers I wrote above, the 50/50 mix should have an area about half of that of 36Ar.

I think it should still be the case with spin incoherence the arguments I made with the math above should still hold, we can check this with a calculation if you like.

For the second issue, I think the similarity between the 36Ar and 38Ar might be down to the weighting scheme used in MDANSE. The scattering lengths get divided by the sum of the scattering lengths of the system. For the 50/50 mix, there is incoherent scattering so we get that large peak. The plot above is not the total (doesn't have the proper scaling because of MDANSEs weight scheme) it's the result from the DCSF calculation, I wanted to show that you get incoherent scattering in the DCSF result when I set the isotopes manually with 50/50.

@gonzalezma
Copy link
Collaborator

Not sure for the spin incoherence. I will let you do the maths and convince me ;-)

As for the second point, there are two things that I do not understand:

  1. I thought that the whole point of this analysis was to give the scattering in absolute units (so no normalization by the sum of the scattering lengths).
  2. What's the argument to interpret all the additional signal as incoherent scattering?

@ChiCheng45
Copy link
Collaborator

ChiCheng45 commented Feb 13, 2025

  1. I only calculated DCSF as an example to show the incoherent signal gets added to the DCSF results when isotopes are set manually, I agree once complete the total shouldn't be using rescaled scattering lengths
  2. It isn't completely incoherent but should be mostly incoherent since it's low q and it is so much larger than either pure Ar36 or Ar38.

# Conflicts:
#	MDANSE/Src/MDANSE/Framework/Jobs/DynamicCoherentStructureFactor.py
#	MDANSE/Src/MDANSE/Framework/Jobs/DynamicIncoherentStructureFactor.py
#	MDANSE/Src/MDANSE/Framework/Jobs/GeneralAutoCorrelationFunction.py
#	MDANSE/Src/MDANSE/Framework/Jobs/NeutronDynamicTotalStructureFactor.py
#	MDANSE/Src/MDANSE/Framework/Jobs/PairDistributionFunction.py
#	MDANSE/Src/MDANSE/Framework/Jobs/PositionAutoCorrelationFunction.py
#	MDANSE/Src/MDANSE/Framework/Jobs/VanHoveFunctionDistinct.py
#	MDANSE/Src/MDANSE/Framework/Jobs/VelocityAutoCorrelationFunction.py
#	MDANSE/Src/MDANSE/Mathematics/Arithmetic.py
# Conflicts:
#	MDANSE/Tests/UnitTests/test_converter.py
@ChiCheng45
Copy link
Collaborator

ChiCheng45 commented Feb 20, 2025

Here is the total scattering with a trajectory with 50/50 mix of randomly set 36Ar and 38Ar and an Ar_Mix atom which I created with b_coh = 14.2E-05 and b_inc = 10.7E-05.

image

As you can see despite b_inc = 0 for both 36Ar and 38Ar, the incoherent scattering got added into the DCSF calculation. Small differences between the two are due to the short trajectory I used which means there is a bit of noise.

image

I think the above results suggest that we are adding things correctly. I've also removed the copying of the DISF and DCSF results into the total mda as I think you can end up with a lot of different results in the mda file especially when there are a lot of atom types.

I've also added a new feature so that there is a scaling setting in the plotter so that the partials are not scaled, this will mean that the partials will be the same as the old MDANSE, something similar to this was suggested in #353. This should also set it up so we can resolve #669 with #550 by using different scaling/weight schemes.

image

image

Here I used equal weights.

@ChiCheng45 ChiCheng45 marked this pull request as ready for review February 20, 2025 09:18
@ChiCheng45 ChiCheng45 requested a review from oerc0122 February 20, 2025 10:13
@ChiCheng45 ChiCheng45 requested a review from oerc0122 February 20, 2025 17:33
Copy link
Collaborator

@oerc0122 oerc0122 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from a minor hopefully clearer (hopefully correct) stylistic thing, if this does what it sets out to functionality-wise, I'm happy for it to go in.

@MBartkowiakSTFC
Copy link
Collaborator Author

I have tried out the new behaviour of the GUI. We can still load the old results (i.e. from previous versions) and the new results have the extra option of reverting to the unweighted numbers, which is good. Thanks for the changes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants