-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generic NeXus loaders #10
Conversation
Failing because I can't figure out how to make scippnexus store positions in the data
src/ess/reduce/nexus.py
Outdated
file_path: Union[FilePath, NeXusFile, NeXusGroup], | ||
*, | ||
detector_name: DetectorName, | ||
instrument_name: Optional[InstrumentName] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the odds that there is more than one instrument is close to zero, much more likely is more than one entry, shouldn't we have EntryName
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable
src/ess/reduce/nexus.py
Outdated
the sample NeXus group. | ||
""" | ||
with _open_nexus_file(file_path) as f: | ||
entry = f['entry'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot assume this naming. Find unique NXentry
, as suggested in the issue description.
src/ess/reduce/nexus.py
Outdated
instrument_name: Optional[InstrumentName] = None, | ||
) -> sc.DataGroup: | ||
with _open_nexus_file(file_path) as f: | ||
entry = f['entry'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dito
src/ess/reduce/nexus.py
Outdated
|
||
|
||
def extract_detector_data( | ||
detector: RawDetector, detector_name: DetectorName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detector: RawDetector, detector_name: DetectorName | |
detector: RawDetector |
I don't think we should have DetectorName
here. Instead return the event data, if found, otherwise the child data array (dense data). Reason: I don't think we should rely on the detector name being related to the name of the event-data subgroup. In the case of monitors there may be no subgroup at all. Also it is simpler to just look for the child data array(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can I identify the data? It's just one of many data arrays in the group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or can there only be one data array per detector / monitor?
src/ess/reduce/nexus.py
Outdated
|
||
|
||
def extract_monitor_data( | ||
monitor: RawMonitor, monitor_name: MonitorName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
monitor: RawMonitor, monitor_name: MonitorName | |
monitor: RawMonitor |
dito
src/ess/reduce/nexus.py
Outdated
def _extract_events_or_histogram(dg: sc.DataGroup, name: str) -> sc.DataArray: | ||
data_names = {f'{name}_events', 'data'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to remove the reliance on naming. Just look for child data arrays, prefer to return the event-data one?
src/ess/reduce/nexus.py
Outdated
loaded = cast( | ||
sc.DataGroup, _unique_child_group(instrument, nx_class, group_name)[()] | ||
) | ||
loaded = snx.compute_positions(loaded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to support the store_transform
arg. The transforms are required in SANS (for the detectors). Probably for now we can just always enable this, we can revisit later if it turns out there are issues with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are stored in the data group. Does the sans code need to change the name of the item?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to know where to find it, so we relied on a name that was defined in the default parameters.
I guess it could either be a parameter passed to this, or maybe it's enough to just have always the same name hard-coded here, but we need to make sure we never have a name clash in a file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jan-Lukas and I already agreed to just store it as "transform", and to simply raise on (unlikely) naming clash.
src/ess/reduce/nexus.py
Outdated
if len(data_arrays) > 1: | ||
raise ValueError( | ||
"Raw data loaded from NeXus contains more than one data array. " | ||
"Cannot uniquely identify the event or histogram data. " | ||
f"Got items {set(dg.keys())}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't do, unfortunately. Facilities have the tendency of storing a histogrammed "preview" of the event data alongside the events. I think we should return the events if found, and the dense data otherwise.
src/ess/reduce/nexus.py
Outdated
loaded = cast( | ||
sc.DataGroup, _unique_child_group(instrument, nx_class, group_name)[()] | ||
) | ||
loaded = snx.compute_positions(loaded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to know where to find it, so we relied on a name that was defined in the default parameters.
I guess it could either be a parameter passed to this, or maybe it's enough to just have always the same name hard-coded here, but we need to make sure we never have a name clash in a file?
|
||
|
||
def load_detector( | ||
file_path: Union[FilePath, NeXusFile, NeXusGroup], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was your plan for using these in a pipeline? Would we wrap them into other providers? I'm asking because of the Union
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works as long as any type is provided. So, this can be used in a pipeline as is. But I think all production pipelines will need different kinds of files. So we will have to wrap it.
@@ -0,0 +1,306 @@ | |||
# SPDX-License-Identifier: BSD-3-Clause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we try (maybe as a pair-programming thing?) to test this on the loki workflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tested and it works nicely: see scipp/esssans#114
Added a check as for transformation. When discussing with Simon, we concluded to just user hard coded names for now because the chance of a conflict is very low. We can always add an argument later if need be. |
src/ess/reduce/nexus.py
Outdated
@@ -234,7 +239,15 @@ def _load_group_with_positions( | |||
loaded = cast( | |||
sc.DataGroup, _unique_child_group(instrument, nx_class, group_name)[()] | |||
) | |||
loaded = snx.compute_positions(loaded) | |||
|
|||
transform_out_name = 'transformation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we were using transform
in ESSsans? transformation
is nearly identical to transformations
, the current name used for the subgroup holding the transformations, so this would be quite confusing. Can we stick with transform
?
Fixes #8
They are heavily based on the LOKI loaders in ESSsans and the test files used in that package.