-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export dataset according to NXLauetof application definition #101
base: main
Are you sure you want to change the base?
Conversation
e1c47d4
to
08007e5
Compare
src/ess/nmx/nexus.py
Outdated
|
||
def _add_lauetof_definition(nx_entry: h5py.Group) -> None: | ||
nx_entry["definition"] = "NXlauetof" | ||
nx_entry["definition"].attrs["NX_class"] = "NX_CHAR" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think fields in NeXus should have an NX_class
attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was specified in the application definition but I wasn't sure if it's the right way. Do you think we should remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't see it here: https://manual.nexusformat.org/classes/applications/NXlauetof.html. Can you give a reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah it was DTYPE not NX_class sorry. I was confused cause they were written next to the field... I removed it.
def export_panel_dependent_data_as_nxlauetof( | ||
*dgs: sc.DataGroup, output_file: str | pathlib.Path | io.BytesIO | ||
) -> None: | ||
with h5py.File(output_file, "r+") as f: | ||
nx_instrument: h5py.Group = f["entry/instrument"] | ||
for dg in dgs: | ||
_add_lauetof_detector_group(dg, nx_instrument) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. If dgs
contains all panels, we still use 3x the memory (leaving aside intermediate steps). Wasn't the point to have a writer that works one panel at a time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you might also want to write them all at the same time, if they are not too large.
For example, you need all three panels at the same time if you want to see all three of them in an instrument view anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I thought you might also want" is often an indicator that it should be postponed until we know it is really needed. Otherwise you need to refactor three times and write tests. And I am not sure why the instrument view is related to this.
In any case: How do you imagine the interface for writing multiple panels to the same file when they are not in memory at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I am not sure why the instrument view is related to this.
After using instrument view, there's no point of iterating it again and you can just save them all at once.
In any case: How do you imagine the interface for writing multiple panels to the same file when they are not in memory at the same time?
The idea was to save panel independent data first always and then append panel-dependent data into the same file.
file_name = "test.nxs"
export_panel_independent_data_as_nxlauetof(binned_dg, output_file=file_name)
export_panel_dependent_data_as_nxlauetof(binned_dg, output_file=file_name)
del binned_dg
for i in range(1, 3):
with temp_parameter(wf, DetectorIndex, i) as temp_wf:
reduced_data = temp_wf.compute(NMXReducedData)
export_panel_dependent_data_as_nxlauetof(reduced_data, output_file=file_name)
del reduced_data
And if you don't have problem having all three of them in the memory, you can just:
dg, dgs = all_data_groups # After collecting computed results
export_panel_independent_data_as_nxlauetof(dg, output_file)
export_panel_dependent_data_as_nxlauetof(dg, *dgs, output_file=output_file)
which can be wrapped into a single interface like:
Lines 292 to 301 in dce3496
def export_as_nxlauetof( | |
dg: sc.DataGroup, *dgs: sc.DataGroup, output_file: str | pathlib.Path | io.BytesIO | |
) -> None: | |
"""Export the reduced data into a nxlauetof format. | |
Exporting step is not expected to be part of sciline pipelines. | |
""" | |
export_panel_independent_data_as_nxlauetof(dg, output_file) | |
export_panel_dependent_data_as_nxlauetof(dg, *dgs, output_file=output_file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I have no objection to make it available for only single panel at once.
I'll update it then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the panel-independent data need the workflow? If so, how do we handle that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
panel-independent data need the workflow
Monitor(control) group might need its own workflow, (we haven't seen any use case of monitor data yet)
but other than that, all other fields only needs a loader.
But also same information is available from the reduced data.
So I was planning to retrieve information from the reduced data.
Do you think we should just retrieve them directly from an input file...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was mainly wondering how we can design the interface to avoid user error. If the panel-independent data is obtain from the wrong workflow, or if someone tries to write panels from incompatible workflows (like different params/settings) to the same file.
Related to #60
This PR only contains place holders.
Due to memory issue, we decided to split exporting function into panel-dependent one and panel-independent one.