Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DREAM monitor normalization #93

Merged
merged 16 commits into from
Nov 14, 2024
Merged

DREAM monitor normalization #93

merged 16 commits into from
Nov 14, 2024

Conversation

jl-wynen
Copy link
Member

Fixes #76

@nvaytet nvaytet self-assigned this Sep 23, 2024
"workflow[MonitorFilename[SampleRun]] = dream.data.simulated_monitor_diamond_sample()\n",
"workflow[MonitorFilename[VanadiumRun]] = dream.data.simulated_monitor_vanadium_sample()\n",
"workflow[MonitorFilename[BackgroundRun]] = dream.data.simulated_monitor_empty_can()\n",
"workflow[CaveMonitorPosition] = sc.vector([0.0, 0.0, -4220.0], unit='mm')\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing in ESS files for DREAM, the position of the monitor could be loaded from the nexus file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

"metadata": {},
"outputs": [],
"source": [
"result.hist().plot()"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth also comparing to the result where normalization by proton charge was used?

"""
Workflow with default parameters for the Dream Geant4 simulation.
"""
wf = LoadGeant4Workflow()
for provider in itertools.chain(powder_providers, _dream_providers):
wf.insert(provider)
run_norm.insert(wf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was quite confused by this interface. I actually thought you got the order wrong, and that it should have been wf.insert(run_norm).
Then I saw that you added a insert method to the Enum.

I was expecting this to work more like e.g. the UncertaintyBroadcastMode, which is just a parameter we set on the workflow (e.g. https://github.com/scipp/esssans/blob/main/docs/user-guide/isis/sans2d.ipynb?short_path=e98bbec#L141)?

Was there a reason I'm missing that meant you chose this interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I used run_norm.insert(wf) instead of wf.insert(run_norm) because I thought it would be good to keep the selection logic within RunNormalization. I see how the call can be confusing. I can change it to a free function. But I don't want the match statement inlined into DreamGeant4Workflow.
  2. We cannot use a parameter like UncertaintyBroadcastMode because the different normalizations need to request different data from the pipeline. Using a combined providers that picks a norm would require requesting the union of all dependencies, that is monitors and proton charge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to have a free function and remove it from the Enum?

An alternative could be to make classmethods that would generate the workflow with the correct providers?
e.g.

LoadGeant4Workflow.normalized_by_monitor_histogram()
LoadGeant4Workflow.normalized_by_monitor_integrated()
LoadGeant4Workflow.normalized_by_proton_charge()

and one of them could be the default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoadGeant4Workflow is a function, not a class. We struggled a bit with finding a good way of using a class in the past. So I'd rather stay away from this for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I forgot that it wasn't a class 🙄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

A histogrammed monitor.
wavelength_edges:
If given, rebin the monitor with these edges.
smooth_args:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you removed the smoothing of the monitor. Was that intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This should be done as a separate pipeline step if desired.

lo = det_coord.min()
hi = det_coord.max()
hi.value = np.nextafter(hi.value, np.inf)
monitor = monitor[dim, lo:hi]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the monitor spans a narrower range than the detector?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the normalization is off, I think.

What do you think should happen?

  1. Keep it as is.
  2. Raise an error.
  3. Crop the data to the monitor range.

This also affects normalize_by_monitor_histogram which currently just uses whatever values lookup produces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say raise an error if it doesn't already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

norm = broadcast_uncertainties(
norm, prototype=detector, mode=uncertainty_broadcast_mode
)
return NormalizedRunData[RunType](detector / norm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't need a sc.lookup here because you are broadcasting to all events, is that correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. And because norm is a scalar.

@@ -175,7 +178,7 @@ def _restore_tof_if_in_wavelength(data: sc.DataArray) -> sc.DataArray:


def add_scattering_coordinates_from_positions(
data: NormalizedByProtonCharge[RunType], graph: ElasticCoordTransformGraph
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I managed to follow why this is now FilteredData... I would have expected NormalizedRunData?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monitor normalization needs a wavelength coord. So it has to be done after add_scattering_coordinates_from_positions. For proton charge normalization, the order does not matter. I moved it to the same place as monitor normalization to make it easier to build the graph.

Comment on lines 109 to 112
# hi is shifted towards MINUS infinity because label-based indexing with bin-edges
# in inclusive for the upper edge. But we don't want the literal upper edge
# of the detector to be included.
hi.value = np.nextafter(hi.value, -np.inf)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was broken in my first implementation. I used np.nextafter(hi.value, +np.inf). Please check that the logic is sound.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things: not sure why do you want to exclude the upper edge? And is label-based indexing the way to go here?

Say you have a case where hi lies right in the middle of one of the monitor bins. I think we should be including half of the counts in that bin in the normalization?
One could for example use rebin with something like bins [lo, monitor[hi:lo].coords['wavelength'][1:-1], hi] (assuming that the monitor coords don't line up exactly with hi and lo).

You could also go further and use https://scipp.github.io/generated/modules/scipy/scipp.scipy.integrate.html?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • How would you use integrate? It doesn't support bin-edge coords. And I don't know if sc.midpoints is the right thing to use for log coords.
  • I tried rebin. It can basically reproduce the result from slicing:
sliced = da['x', lo:hi]
sliced_norm = sc.sum(sliced.data)

rebinned = da.rebin(x=target_bins)
rebinned_norm = sc.sum(rebinned.data)

But this does not work when the bin width is included as sc.sum(da.data * (coord[1:] - coord[:-1])) because of how rebin distributes weights. So this would need some extra code to take that into account.

However, I agree that, in your example, we should include part of the upper bin. Do you see a simple way of doing so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conclusion from offline discussion was: rebin actually works.

@@ -28,6 +29,7 @@
EmptyBeamRun = reduce_gt.EmptyBeamRun
Filename = reduce_gt.Filename
Incident = reduce_gt.Incident
CaveMonitor = reduce_gt.Monitor1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considering naming this by what is does instead of where it is?

Comment on lines +321 to +339
class RunNormalization(enum.Enum):
"""Type of normalization applied to each run."""

monitor_histogram = enum.auto()
monitor_integrated = enum.auto()
proton_charge = enum.auto()


def insert_run_normalization(
workflow: sciline.Pipeline, run_norm: RunNormalization
) -> None:
"""Insert providers for a specific normalization into a workflow."""
match run_norm:
case RunNormalization.monitor_histogram:
workflow.insert(normalize_by_monitor_histogram)
case RunNormalization.monitor_integrated:
workflow.insert(normalize_by_monitor_integrated)
case RunNormalization.proton_charge:
workflow.insert(normalize_by_proton_charge)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I lack some context, but I am not convinced by the design choice here. Why do we need an enum which then decides which providers to insert? Wouldn't it be more natural and a more modular design if we had a small normalization workflow (or 3 actually) which can be passed to (dependency injected into) the main workflow creation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be great. But it requires scipp/sciline#180

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could do this, would you require the users to know where to insert the workflow? I.e.

wf = DreamWorkflow()
norm = MonitorHistogramNormalization()
wf[NormalizedRunData[RunType]] = norm

Or do we still need a helper function?

wf = DreamWorkflow()
norm = MonitorHistogramNormalization()
insert_run_normalization(wf, norm)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither:

wf = DreamWorkflow(normalization_workflow=MonitorHistogramNormalization())

or something like that.

We should look into scipp/sciline#180 then, before adding workarounds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As upstream changes ran into trouble, I suggest we keep this as it is.

src/ess/powder/correction.py Outdated Show resolved Hide resolved
@jl-wynen jl-wynen enabled auto-merge November 14, 2024 15:57
@jl-wynen jl-wynen merged commit 5a5ca40 into main Nov 14, 2024
4 checks passed
@jl-wynen jl-wynen deleted the monitor-norm branch November 14, 2024 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Requirement] Normalise by monitor
3 participants