Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sciline #7

Merged
merged 16 commits into from
Dec 7, 2023
Merged

Use sciline #7

merged 16 commits into from
Dec 7, 2023

Conversation

jl-wynen
Copy link
Member

This is a first draft of a Sciline pipeline. Can you take a look if this is a proper usage of Sciline? I'm going to add tests and more documentation later.

I tested it using

import scipp as sc
import sciline
import ess.diffraction
from ess.diffraction.external import powgen
from ess.diffraction import powder
import scippneutron as scn
from ess.diffraction.grouping import group_by_two_theta, merge_all_pixels

from ess.diffraction.types import *

params = {
    CalibrationFilename: 'PG3_FERNS_d4832_2011_08_24.zip',
    DspacingBins: sc.linspace('dspacing', 0.0, 2.3434, 200, unit='angstrom'),
    Filename[SampleRun]: 'PG3_4844_event.zip',
    Filename[VanadiumRun]: 'PG3_4866_event.zip',
    TwoThetaBins: sc.linspace(dim='two_theta', unit='deg', start=25.0, stop=90.0, num=16),
    ValidTofRange: sc.array(dims=['tof'], values=[0.0, 16666.67], unit='us'),
}
providers = [
    *ess.diffraction.providers,
    *powder.providers,
    *powgen.providers,
]
# providers.remove(merge_all_pixels)
# providers = (*providers, group_by_two_theta)
pl = sciline.Pipeline(
    providers,
    params=params,
)
result = pl.compute(DspacingHistogram)

And compared to https://scipp.github.io/ess/instruments/external/powgen/powgen_reduction.html#Vanadium-correction
The results don't quite agree. This is because the old workflow did not crop the vanadium data in tof but the new pipeline does. Without this change, the results are identical.

@SimonHeybrock
Copy link
Member

Looks good. Can you show the resulting graph?

The results don't quite agree. This is because the old workflow did not crop the vanadium data in tof but the new pipeline does. Without this change, the results are identical.

Why was the cropping added? Is this a new requirement, or was it an oversight in the old workflow?

@jl-wynen
Copy link
Member Author

Why was the cropping added? Is this a new requirement, or was it an oversight in the old workflow?

I think it was an oversight in the old workflow. I see no reason why the vanadium run would be able to access a higher tof range. In Mantid, the tof bounds are obtained from an isntrument characterization file. Since we don't have those, I hard coded the bounds in the workflow. But then forget to apply them to vanadium.

@jl-wynen
Copy link
Member Author

graph

@SimonHeybrock
Copy link
Member

👍

@jl-wynen jl-wynen marked this pull request as ready for review November 29, 2023 10:20
@jl-wynen
Copy link
Member Author

Ready now. Fixes #3

Comment on lines 76 to 77
data: MergedPixels[SampleRun],
vanadium: MergedPixels[VanadiumRun],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in powder, the reduction (pixel merging) is referred to as "focussing", I don't know if that name should be adopted here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably

Comment on lines +93 to +95
return TofCroppedData[RunType](
data.bin(tof=tof_range.to(unit=data.coords['tof'].unit))
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note https://scipp.github.io/user-guide/binned-data/filtering.html#Extract-events-falling-into-a-parameter-interval, which would avoid the need for tof binning, i.e., adding tof as a dim.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work. It seems to do nothing:

Screenshot_20231205_160814

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot a .bins

:
`data` with bad events removed.
"""
# TODO this needs to filter by proton charge once we know how
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new or old todo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't written down before. I added this provider in order to define a FilteredData type and make the tof-cropping explicit in the graph. Alternatively, I could drop this provider and make crop_tof return FilteredData.

Comment on lines +58 to +60
def finalize_histogram(
data: NormalizedByVanadium, edges: DspacingBins
) -> DspacingHistogram:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like we are normalizing in event mode, which is bad for error handling. Should histogram before vanadium division?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I didn't want to alter the workflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you open an issue at least?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 44 to 45
TwoThetaBins = NewType('TwoThetaBins', sc.Variable)
"""Bin edges for 2theta."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is for grouping by scattering angle, instead of into a single spectrum? Can you document this?

Comment on lines 60 to 63
# This is Mantid-specific and can probably be removed when the POWGEN
# workflow is removed.
DetectorInfo = NewType('DetectorInfo', sc.Dataset)
"""Mapping between detector numbers and spectra."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be moved into external/powgen then?

"""Raw data."""


class RawDataWithvariances(sciline.Scope[RunType, sc.DataArray], sc.DataArray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class RawDataWithvariances(sciline.Scope[RunType, sc.DataArray], sc.DataArray):
class RawDataWithVariances(sciline.Scope[RunType, sc.DataArray], sc.DataArray):

Comment on lines 8 to 18
def drop_variances(data: RawDataWithvariances[VanadiumRun]) -> RawData[VanadiumRun]:
res = data.copy(deep=False)
if res.bins is not None:
res.bins.constituents['data'].variances = None
else:
res.variances = None
return RawData[VanadiumRun](res)


providers = (drop_variances,)
"""Sciline providers for handling statistical uncertainties."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the vanadium errors dropped?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we normalise in event mode and that requires broadcasting a vanadium histogram to sample events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should not do that?

Comment on lines +85 to +86
" *ess.diffraction.providers,\n",
" *powder.providers,\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit confusing, powder is a submodule of ess.diffraction? What is the split? Do you want to explain to the reader?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the goal was to also have single crystal code in this project. So I kept the old module structure from ess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are things that are common between powder and SXD?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I thought that was the reason you suggested a common package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is: You have ess.diffraction.providers, implying there is something common?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Many providers are regular functions that were already defined in the common diffraction part of ess. I don't know whether they can be used by SXD. Esp. as providers because the domain types are probably all different.

So what do you suggest? Move everything into powder? Or remove powder and move everything into diffraction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can leave it, but maybe add a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened an issue to track this and make us re-examine when we add sxd code.

"\n",
"We ultimately need to write the reduced data to a file.\n",
"This could be done with the `result` we computed above.\n",
"But then we would have to provide all metadata manually.\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose there is no metadata here? This is a bit confusing maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filename, sort of. But yes, there isn't any real metadata. I can change it now and revert later when we write a cif file instead.

Comment on lines 301 to 316
"The above pipeline merges all instrument pixels to produce a 1d d-spacing curve.\n",
"If instead we want to group into $2\\theta$ bins, we can alter the pipeline by replacing the grouping step:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d24c8bf8-ad59-4211-ae6a-3ee29b0556a3",
"metadata": {},
"outputs": [],
"source": [
"from ess.diffraction.grouping import group_by_two_theta, merge_all_pixels\n",
"\n",
"grouping_providers = list(providers)\n",
"grouping_providers.remove(merge_all_pixels)\n",
"grouping_providers = (*grouping_providers, group_by_two_theta)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to have TwoThetaBins as an optional input to a single provider, which falls back to merging all pixels of no bins provided? This would avoid the need to modify the provider list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. What I don't like about it (and the current approach) is that the dims of the result are different. So downstream providers cannot necessarily use the data. Swapping a providers is more explicit about this change than adding a parameter.

But maybe the two providers should return different types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, swapping a provider does not fix the problem, so a different type would work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my discussion with Andrew about CIF, it seems that we will have to focus in some way that still needs to be determined. It sounds like grouping all pixels is quite useless. But using 2theta is likely also not a good approach, at least for DREAM.

So I wouldn't put too much thought into finding a good solution now. Since this is not the last step in the pipeline, I'd rather keep the unified output type until we know more.

@jl-wynen
Copy link
Member Author

jl-wynen commented Dec 6, 2023

I think I addressed your comments. Can you have another look, @SimonHeybrock ?

@jl-wynen jl-wynen merged commit c54508a into main Dec 7, 2023
3 checks passed
@jl-wynen jl-wynen deleted the use-sciline branch December 7, 2023 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants