-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DREAM monitor normalization #93
Conversation
9e50564
to
0e1aba6
Compare
"workflow[MonitorFilename[SampleRun]] = dream.data.simulated_monitor_diamond_sample()\n", | ||
"workflow[MonitorFilename[VanadiumRun]] = dream.data.simulated_monitor_vanadium_sample()\n", | ||
"workflow[MonitorFilename[BackgroundRun]] = dream.data.simulated_monitor_empty_can()\n", | ||
"workflow[CaveMonitorPosition] = sc.vector([0.0, 0.0, -4220.0], unit='mm')\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing in ESS files for DREAM, the position of the monitor could be loaded from the nexus file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"result.hist().plot()" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth also comparing to the result where normalization by proton charge was used?
src/ess/dream/workflow.py
Outdated
""" | ||
Workflow with default parameters for the Dream Geant4 simulation. | ||
""" | ||
wf = LoadGeant4Workflow() | ||
for provider in itertools.chain(powder_providers, _dream_providers): | ||
wf.insert(provider) | ||
run_norm.insert(wf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was quite confused by this interface. I actually thought you got the order wrong, and that it should have been wf.insert(run_norm)
.
Then I saw that you added a insert
method to the Enum
.
I was expecting this to work more like e.g. the UncertaintyBroadcastMode
, which is just a parameter we set on the workflow (e.g. https://github.com/scipp/esssans/blob/main/docs/user-guide/isis/sans2d.ipynb?short_path=e98bbec#L141)?
Was there a reason I'm missing that meant you chose this interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I used
run_norm.insert(wf)
instead ofwf.insert(run_norm)
because I thought it would be good to keep the selection logic withinRunNormalization
. I see how the call can be confusing. I can change it to a free function. But I don't want thematch
statement inlined intoDreamGeant4Workflow
. - We cannot use a parameter like
UncertaintyBroadcastMode
because the different normalizations need to request different data from the pipeline. Using a combined providers that picks a norm would require requesting the union of all dependencies, that is monitors and proton charge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to have a free function and remove it from the Enum?
An alternative could be to make classmethods that would generate the workflow with the correct providers?
e.g.
LoadGeant4Workflow.normalized_by_monitor_histogram()
LoadGeant4Workflow.normalized_by_monitor_integrated()
LoadGeant4Workflow.normalized_by_proton_charge()
and one of them could be the default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LoadGeant4Workflow
is a function, not a class. We struggled a bit with finding a good way of using a class in the past. So I'd rather stay away from this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I forgot that it wasn't a class 🙄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
A histogrammed monitor. | ||
wavelength_edges: | ||
If given, rebin the monitor with these edges. | ||
smooth_args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you removed the smoothing of the monitor. Was that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This should be done as a separate pipeline step if desired.
lo = det_coord.min() | ||
hi = det_coord.max() | ||
hi.value = np.nextafter(hi.value, np.inf) | ||
monitor = monitor[dim, lo:hi] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the monitor spans a narrower range than the detector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the normalization is off, I think.
What do you think should happen?
- Keep it as is.
- Raise an error.
- Crop the data to the monitor range.
This also affects normalize_by_monitor_histogram
which currently just uses whatever values lookup produces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say raise an error if it doesn't already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
norm = broadcast_uncertainties( | ||
norm, prototype=detector, mode=uncertainty_broadcast_mode | ||
) | ||
return NormalizedRunData[RunType](detector / norm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you don't need a sc.lookup
here because you are broadcasting to all events, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And because norm
is a scalar.
@@ -175,7 +178,7 @@ def _restore_tof_if_in_wavelength(data: sc.DataArray) -> sc.DataArray: | |||
|
|||
|
|||
def add_scattering_coordinates_from_positions( | |||
data: NormalizedByProtonCharge[RunType], graph: ElasticCoordTransformGraph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I managed to follow why this is now FilteredData
... I would have expected NormalizedRunData
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Monitor normalization needs a wavelength coord. So it has to be done after add_scattering_coordinates_from_positions
. For proton charge normalization, the order does not matter. I moved it to the same place as monitor normalization to make it easier to build the graph.
src/ess/powder/correction.py
Outdated
# hi is shifted towards MINUS infinity because label-based indexing with bin-edges | ||
# in inclusive for the upper edge. But we don't want the literal upper edge | ||
# of the detector to be included. | ||
hi.value = np.nextafter(hi.value, -np.inf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was broken in my first implementation. I used np.nextafter(hi.value, +np.inf)
. Please check that the logic is sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things: not sure why do you want to exclude the upper edge? And is label-based indexing the way to go here?
Say you have a case where hi
lies right in the middle of one of the monitor bins. I think we should be including half of the counts in that bin in the normalization?
One could for example use rebin
with something like bins [lo, monitor[hi:lo].coords['wavelength'][1:-1], hi]
(assuming that the monitor coords don't line up exactly with hi
and lo
).
You could also go further and use https://scipp.github.io/generated/modules/scipy/scipp.scipy.integrate.html?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- How would you use
integrate
? It doesn't support bin-edge coords. And I don't know ifsc.midpoints
is the right thing to use for log coords. - I tried
rebin
. It can basically reproduce the result from slicing:
sliced = da['x', lo:hi]
sliced_norm = sc.sum(sliced.data)
rebinned = da.rebin(x=target_bins)
rebinned_norm = sc.sum(rebinned.data)
But this does not work when the bin width is included as sc.sum(da.data * (coord[1:] - coord[:-1]))
because of how rebin
distributes weights. So this would need some extra code to take that into account.
However, I agree that, in your example, we should include part of the upper bin. Do you see a simple way of doing so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conclusion from offline discussion was: rebin
actually works.
0c71161
to
149203c
Compare
src/ess/powder/types.py
Outdated
@@ -28,6 +29,7 @@ | |||
EmptyBeamRun = reduce_gt.EmptyBeamRun | |||
Filename = reduce_gt.Filename | |||
Incident = reduce_gt.Incident | |||
CaveMonitor = reduce_gt.Monitor1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considering naming this by what is does instead of where it is?
class RunNormalization(enum.Enum): | ||
"""Type of normalization applied to each run.""" | ||
|
||
monitor_histogram = enum.auto() | ||
monitor_integrated = enum.auto() | ||
proton_charge = enum.auto() | ||
|
||
|
||
def insert_run_normalization( | ||
workflow: sciline.Pipeline, run_norm: RunNormalization | ||
) -> None: | ||
"""Insert providers for a specific normalization into a workflow.""" | ||
match run_norm: | ||
case RunNormalization.monitor_histogram: | ||
workflow.insert(normalize_by_monitor_histogram) | ||
case RunNormalization.monitor_integrated: | ||
workflow.insert(normalize_by_monitor_integrated) | ||
case RunNormalization.proton_charge: | ||
workflow.insert(normalize_by_proton_charge) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I lack some context, but I am not convinced by the design choice here. Why do we need an enum which then decides which providers to insert? Wouldn't it be more natural and a more modular design if we had a small normalization workflow (or 3 actually) which can be passed to (dependency injected into) the main workflow creation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be great. But it requires scipp/sciline#180
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could do this, would you require the users to know where to insert the workflow? I.e.
wf = DreamWorkflow()
norm = MonitorHistogramNormalization()
wf[NormalizedRunData[RunType]] = norm
Or do we still need a helper function?
wf = DreamWorkflow()
norm = MonitorHistogramNormalization()
insert_run_normalization(wf, norm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither:
wf = DreamWorkflow(normalization_workflow=MonitorHistogramNormalization())
or something like that.
We should look into scipp/sciline#180 then, before adding workarounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As upstream changes ran into trouble, I suggest we keep this as it is.
149203c
to
6220922
Compare
Fixes #76