Use time-of-flight workflow in Dream data reduction #125

Open — wants to merge 21 commits into main
Conversation

nvaytet
Member

@nvaytet commented Jan 27, 2025

In this PR, we incorporate the time-of-flight computation workflow into the DREAM data reduction workflow.

This yields much improved results: peaks in d-spacing are narrow and symmetrical, and lines are vertical along the 2theta dimension.

Before:
Screenshot_20250130_131340
Screenshot_20250130_131354

After:
Screenshot_20250130_131428
Screenshot_20250130_131438

@nvaytet marked this pull request as ready for review January 30, 2025 12:15
@nvaytet requested a review from jl-wynen January 30, 2025 12:33
@nvaytet marked this pull request as draft January 30, 2025 12:35
@nvaytet marked this pull request as ready for review January 30, 2025 14:55
- grouped_dspacing = workflow.compute(IofDspacingTwoTheta)
+ grouped_dspacing = workflow.compute(IofDspacingTwoTheta).bins.concat("event_time_zero")
Member

Does this mean that the result has a new dimension?

Member Author

Yes, we expect NeXus data to have the event_time_zero dimension, because it records events from more than one pulse.
The data we have in the geant4/mcstas files isn't very realistic: it has tof values that exceed 71 ms.

Member

But shouldn't the time have been removed at some point during conversion to d-spacing?
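As a minimal numpy sketch (with hypothetical arrays, not the PR's actual data), concatenating the per-pulse event lists is conceptually what `.bins.concat("event_time_zero")` does: the pulse dimension disappears and all events end up in one flat list:

```python
import numpy as np

# Hypothetical per-pulse event lists: one array of tof values per pulse
# (i.e. per event_time_zero value).
events_per_pulse = [np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0])]

# Concatenating along the pulse dimension yields a single flat event list,
# analogous to `.bins.concat("event_time_zero")` on binned scipp data.
all_events = np.concatenate(events_per_pulse)
```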

# Add an event_time_zero coord for each bin, but not as bin edges,
# as all events in the same pulse have the same event_time_zero, hence the `[:2]`
da.coords["event_time_zero"] = (
    sc.scalar(1730450434078980000, unit="ns").to(unit="us") + da.coords["tof"]
)
Member

What is this number?

Member Author

It's a random date in the past. The actual value does not matter.
I added a comment.
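For illustration (a numpy sketch with made-up tof values, not the PR's code): the coordinate is just a fixed reference timestamp, converted from ns to µs, plus the per-event time-of-flight, which is why any past date works:

```python
import numpy as np

# Arbitrary reference date in nanoseconds (the actual value does not matter).
REFERENCE_NS = 1730450434078980000
reference_us = REFERENCE_NS / 1000.0  # ns -> us

# Hypothetical per-event time-of-flight values in microseconds.
tof_us = np.array([100.0, 50_000.0, 90_000.0])

# Absolute event time: reference timestamp plus time-of-flight.
event_time_us = reference_us + tof_us
```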


period = (1.0 / sc.scalar(14.0, unit="Hz")).to(unit="us")
# Bin the data into bins of one pulse period (~71 ms)
da = da.bin(tof=sc.arange("tof", 3) * period)
Member

Why 2 bins? Does it just so happen that the simulation contains 2 pulses?

Member Author

It is 2 bins because the tofs spill over into the next pulse, but not into the one after that.
In principle, though, another file could behave differently.
I changed it to compute npulses based on the max tof and the period.
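The change described above can be sketched like this (numpy only; `PERIOD_US` and `pulse_bin_edges` are illustrative names, not the PR's actual API):

```python
import numpy as np

PERIOD_US = 1e6 / 14.0  # pulse period at 14 Hz, ~71.4 ms, in microseconds

def pulse_bin_edges(tof_us: np.ndarray) -> np.ndarray:
    """Bin edges covering all tof values, one bin per pulse period.

    The number of pulses is derived from the maximum tof and the source
    period instead of being hard-coded to 2.
    """
    npulses = int(np.ceil(tof_us.max() / PERIOD_US))
    return np.arange(npulses + 1) * PERIOD_US
```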

def compute_detector_time_of_flight(
    detector_data: DetectorData[RunType], tof_workflow: TofWorkflow
) -> TofData[RunType]:
    wf = tof_workflow.pipeline.copy()
Member

How expensive is this? Does it copy the cached intermediate results?

Member Author

I always assumed it made a cheap shallow copy?
I think it's using https://networkx.org/documentation/stable/reference/classes/generated/networkx.Graph.copy.html
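The distinction under discussion can be illustrated with a plain dict standing in for the pipeline graph (a sketch of shallow-copy semantics in general, not of networkx specifically):

```python
# A shallow copy duplicates the mapping itself, but the stored objects
# (e.g. cached intermediate results) are shared rather than re-computed.
pipeline = {"cached_result": [1, 2, 3]}
shallow = dict(pipeline)

# The two dicts are distinct objects...
assert shallow is not pipeline
# ...but they reference the very same cached value.
assert shallow["cached_result"] is pipeline["cached_result"]
```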

Comment on lines 259 to 264
def set_monitor_zeros_to_nan(
    monitor: TofMonitorData[RunType, MonitorType],
) -> TofMonitorDataZerosToNan[RunType, MonitorType]:
    inds = monitor.values == 0.0
    monitor.values[inds] = np.nan
    return TofMonitorDataZerosToNan[RunType, MonitorType](monitor)
Member

Looking back at the discussion about granularity, can this be merged with the above provider? Do we need to expose the data without NaNs here? Or should it be part of convert_monitor_to_wavelength?

Also, this provider modifies its input which can lead to surprises down the line. Avoiding this would be easiest by merging it into compute_monitor_time_of_flight.

Member Author

Yeah, I changed my mind at least twice on this one.
In the end, I made it into separate steps because maybe not all workflows or instruments would want to turn the zero counts into NaN, and separate steps make it easier to insert a different provider.

But I agree that it makes the graph messier and modifies the data in place.
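A non-mutating variant (a minimal numpy sketch; `zeros_to_nan` is a hypothetical name, not the PR's provider) would return a new array instead of writing NaN into the input:

```python
import numpy as np

def zeros_to_nan(values: np.ndarray) -> np.ndarray:
    """Return a copy of `values` with zero counts replaced by NaN.

    Unlike in-place assignment, the input array is left untouched.
    """
    return np.where(values == 0.0, np.nan, values)
```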

    detector_data:
        Data from the detector.
    """
    return TofData[RunType](detector_data)
Member

Can we avoid this dummy provider by changing extract_raw_data to return TofData instead of DetectorData?
