Centralise dynamic NeXus transforms for f144-driven geometry#923
Draft
SimonHeybrock wants to merge 17 commits into
Draft
Centralise dynamic NeXus transforms for f144-driven geometry#923SimonHeybrock wants to merge 17 commits into
SimonHeybrock wants to merge 17 commits into
Conversation
Replace the per-source `dynamic_transforms` wiring in `DetectorViewFactory` with a per-instrument registry on `Instrument.dynamic_transforms` that the new `apply_dynamic_transforms` helper applies to any factory. The helper walks the depends_on chain in the geometry artifact, intersects against the registry, and inserts a per-component-type closure that drives matching NXlog placeholders from f144 streams. Pre-emptive fix for the LOKI i_of_q workflow which crashes at construction time against the chopper-aware geometry artifact (issue #922): its rear-bank depends_on chain walks through `detector_carriage/value`, an empty NXlog that essreduce's `reject_time_dependent_transform` rejects because no filter is installed. Key shifts versus the prior single-transform mechanism: - Bindings are keyed by `nxlog_path` and carry their own `consumers` set, so one binding can cover multiple components sharing a chain. Replaces `LOKI_DYNAMIC_TRANSFORMS` (dict keyed by source). - Each binding's Sciline key is an author-declared `TransformLog` subclass (grep-able, type-clean), replacing the awkward `NewType('TransformValueLog', sc.DataArray | None)`. SPW gains a tiny rule that wraps raw NXlog payloads into the typed subclass before `set_context`. - `Instrument.register_spec` automatically merges the matching dynamic-transform aux sources into every spec (scoped per spec source list), so factories declare aux only for their own roles. - A patched `NeXusTransformationChain[T, SampleRun]` provider per loaded component type fuses the previous two-step `TransformValueLog -> TransformValue -> chain` into one closure with concrete annotations. The single-transform machinery is deleted: `TransformValueStream`, `TransformValueLog`, `TransformValue`, `TransformName`, `get_transformation_chain_with_value`, `transform_value_from_log`, `add_dynamic_transform`, `DetectorROIAuxSources(dynamic_transforms=...)`, the `DetectorViewFactory(dynamic_transforms=...)` ctor argument, and `tests/handlers/detector_view/transform_value_test.py`. A CI registry validator parameterised per instrument confirms each binding's `nxlog_path` is reachable from every declared consumer in the artifact and that no orphan empty NXlogs slip through. `beam_monitor_m4` position remains a known follow-up (its chain still walks an unmanaged empty NXlog `trans_20`); resolving it needs either a `make_geometry_nexus.py` change to share the carriage NXlog or a separate m4 position f144 stream registration. Tracked in `_KNOWN_ORPHAN_NXLOGS`. Design captured in `docs/developer/plans/dynamic-nexus-transforms.md`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apply_dynamic_transforms was duplicating scippnexus's depends_on walker through ~75 LOC of h5py helpers (_decode, _walk_depends_on, _component_chain_paths, _scan_for_empty_nxlog), with an unused _NX_CLASS_BY_TYPE dict and a _component_group_path whose component_type parameter was never read. Replace all of it with a single load_depends_on_chain helper that delegates to scippnexus.parse_depends_on_chain, picking up correct handling of relative targets, terminal '.', and cycles. The empty- NXlog check now inspects the loaded chain's transformations directly. Going through ess.reduce's NeXusTransformationChain provider was rejected because it requires a full NeXusComponent load first — heavy and warns on partial artifacts (e.g. detectors without a signal field), which would surface at workflow-construction time.
Per-source scoping is statically known from the registry — `source_name in binding.consumers`. Inspecting the artifact at workflow-construction time duplicated what CI (`tests/config/dynamic_transforms_registry_test.py`) already verifies: that the registry's `nxlog_path`/`consumers` match the geometry artifact. Trust CI as the source of truth and use registry-only matching at runtime; no file I/O on the runtime path. Moves the scippnexus-based chain walker into the CI test (its sole user), and drops the now-redundant runtime unmanaged-NXlog test (covered by the CI orphan scan).
The closure replaces essreduce's ``get_transformation_chain``, so it must share the same input (NeXusComponent) and output (NeXusTransformationChain) types — depending on the chain instead would self-cycle on its own return type. Spell this out in the docstring.
The dynamic-transform aux-source plumbing is cross-cutting, not detector-view-specific. Move it to its natural home so `Instrument.register_spec` no longer has to reach upward into `detector_view_specs`: - `CombinedAuxSources` -> `config/workflow_spec.py` (next to `AuxSources`; pure composition primitive). - `_DynamicTransformAuxSources` + `compose_aux_sources` -> `handlers/dynamic_transforms.py`. Fold the former `dynamic_transform_aux_inputs` / `dynamic_transform_routes` helpers into the class -- they were only used by it. `detector_view_specs.py` retains `DetectorROIAuxSources`, which genuinely is detector-specific.
`apply_dynamic_transforms` previously took `component_types=(NXdetector,)` and resolved each component's source name via `workflow.compute(NeXusName[T])`. That single-key lookup cannot distinguish per-channel monitors (`NeXusMonitorName[Incident]` vs `NeXusMonitorName[Transmission]`, ...), which matters because some monitor sources do move and may grow dynamic-transform bindings. Take `dynamic_keys + aux_source_names` instead: iterate `NeXusData[T, R]` entries (the natural per-channel granularity) and resolve each alias through `aux_source_names`. Plain dict lookup, no graph round-trip, no precondition that the workflow already has `NeXusName[T]` set. Call sites (LOKI `i_of_q`, detector-view factory) pass through the existing `aux_source_names` parameter; explicit `wf[NeXusDetectorName] = source_name` / `wf[NeXusMonitorName[T]] = ...` lines stay where they are.
The binding-matching logic is pure data on `Instrument`; only the chain-provider insertion is sciline-aware (and that's a single `workflow.insert(...)` line). Expose it as a method so factories don't import the function and the registry stays encapsulated, in the same spirit as `register_spec` auto-applying `compose_aux_sources`. The method takes a resolved `source_name -> component_type` mapping rather than `dynamic_keys + aux_source_names`. Alias resolution is a factory/workflow-construction concern; `Instrument` shouldn't know about it. Callers enumerate the components explicitly, which doubles as documentation of which loaded components are subject to dynamic geometry. `_build_patched_chain_provider` stays in `handlers/dynamic_transforms.py` and is imported lazily by the method to avoid the cycle.
The two checks asserted dataclass-default and that `class A(B): pass` produces a distinct type — testing Python, not our code. The subclasses themselves are kept as fixtures for the remaining tests.
Extract the ``exec`` from ``_build_patched_chain_provider`` into a ``_synthesise_provider`` helper whose docstring spells out why code synthesis is unavoidable (Sciline introspects via ``inspect.getfullargspec`` which reads ``__code__`` and ignores ``__signature__``; ``__annotations__`` alone cannot invent parameters) and points at the stdlib precedents that use the same technique. ``_build_patched_chain_provider`` now reads as plain Python: define ``_impl``, build the annotations dict, hand off. ``_impl`` takes ``*containers`` so the synthesised wrapper can call it directly without a tuple-packing dance in the template.
Mirror DetectorViewFactory's pattern: an optional ``instrument`` kwarg lets create_monitor_workflow apply the instrument's dynamic-transform bindings to the internal sciline pipeline (typed as ``NeXusTransformationChain[NXmonitor, SampleRun]``) and merge the resulting Sciline context keys with any caller-supplied ones. With no matching binding the call is a no-op, so wiring this in at factory sites is safe ahead of any monitor source being added to a binding.
…tories Every production call site has an Instrument in scope, so optionality just allowed misuse to compile. Wire ``instrument=instrument`` through all instrument factories (bifrost, dream, dummy, estia, loki, nmx, odin, tbl) and the ``create_monitor_workflow_factory`` wrapper. With no monitor in any binding's ``dependent_sources`` this is a no-op today; lights up automatically when a binding for a movable monitor is registered.
Mirrors the same change for ``create_monitor_workflow``. Every production ``DetectorViewFactory(...)`` site has an Instrument in scope and most already construct ``InstrumentDetectorSource(instrument)`` right next to the factory; the optionality was an accident of the parent PR's roll-out order, not a design choice. Drop the ``if self._instrument is not None`` guard inside ``make_workflow`` and wire ``instrument=`` at the dummy, NMX, DREAM and ``Instrument.load_factories`` call sites plus three test helpers.
Now that ``DetectorViewFactory`` always receives an ``Instrument``, wrapping the same instrument in ``InstrumentDetectorSource`` to pass again via ``data_source`` is redundant. Make ``data_source`` optional; when omitted, build ``EmptyDetector`` from ``instrument.get_detector_number(source_name)`` — what the wrapper did anyway. ``NeXusDetectorSource`` stays as the opt-in for file-loaded geometry (still required for wavelength mode); ``DetectorNumberSource`` remains for tests that want a file-less, instrument-less setup. Also tightens the wavelength-mode guard: previously only ``DetectorNumberSource`` was rejected; now any data_source that is not ``NeXusDetectorSource`` is, closing a latent gap where the instrument-derived path (no positions) would have silently failed deeper in essreduce.
d3595c5 to
e279827
Compare
The mechanism added on this branch — typed Sciline-key wrappers around
cumulative `ToNXlog` payloads, plus a `StreamProcessorWorkflow` rule that
wraps raw NXlogs in the typed key before `set_context` — is generic, but
the symbols were named and placed as if exclusively for dynamic NeXus
transformations. Comparing this branch to `chopper-workflow` made the
overlap concrete: the chopper workflow re-implements the same
"latest-sample-of-context-NXlog → Sciline parameter" pattern by hand
(an `accumulate()` override on the workflow class, scalar caches, a
closure provider), and the per-spec aux-source routing is reimplemented
as a one-off helper. Both branches duplicate the routing layer and the
typed-key convention; only the chain-patcher itself is transform-specific.
This commit factors the convention into a small, generic layer that
either branch can consume:
- `ValueLog` (frozen dataclass with `values: sc.DataArray | None`) moves
to `stream_processor_workflow.py`, alongside the wrapping rule that
consumes it. The name drops the transform-flavoured prefix — the class
is the typed wrapper for an NXlog's `value`-over-`time` payload, no
more transform-specific than `ToNXlog` is.
- `synthesise_provider` (was `_synthesise_provider` in
`dynamic_transforms.py`) moves to `stream_processor_workflow.py` as
public API. It is pure code-synthesis: build a Sciline provider with
N named typed positional parameters. The codegen path is unavoidable
because Sciline introspects providers via `inspect.getfullargspec()`,
reading `__code__` rather than `__signature__`; producing typed
positional params therefore requires a real function via
`exec`/`compile`. Same technique `dataclasses`, `attrs`, `pydantic`,
and `collections.namedtuple` use.
- A new `handlers/log_context.py` module hosts the generic routing
layer:
- `LogContextBinding` — declares "f144 stream X feeds Sciline key K,
scoped to sources S, with units U". Carries optional `units` so
`Instrument` can auto-populate `f144_attribute_registry` from
bindings, removing the manual splice currently required.
- `LogContextAuxSources` — walks `instrument.log_context_bindings`
to derive a spec's aux-source set; renders per-job by
`source_name`. Identical behaviour to the previous
`_DynamicTransformAuxSources`, generalised over binding kind.
- `compose_aux_sources` — same.
- `dynamic_transforms.py` shrinks to the transform-specific layer:
- `DynamicTransformBinding(LogContextBinding)` adds `nxlog_path`.
- `build_patched_chain_provider` (renamed from the private helper)
uses imported `synthesise_provider` and `ValueLog`.
- `Instrument.dynamic_transforms` is renamed to `log_context_bindings`
and re-typed to the generic base. `apply_dynamic_transforms` filters
via `isinstance(b, DynamicTransformBinding)` so non-transform
bindings (future chopper bindings, anything else f144-driven that
needs context delivery without provider patching) coexist on the
same list. `__post_init__` auto-derives `f144_attribute_registry`
entries from bindings that declare units; explicit entries still win
on conflict, so existing instrument configs are unchanged.
The LOKI carriage binding picks up `units='mm'` to demonstrate the
auto-derivation. No behavioural change — the explicit
`f144_log_streams`-derived registry continues to populate the same
entry; the binding's units now make the explicit one redundant rather
than load-bearing.
What this enables on `chopper-workflow` after the rebase: drop the
legacy single-binding transform machinery in
`detector_view/workflow.py` (subsumed here), drop
`_chopper_aux_sources` and `make_chopper_attribute_registry` (the
binding list replaces both), and reshape `_WavelengthLutWorkflow` to
consume per-chopper logs via a `synthesise_provider`-built provider
rather than a Python-side cache fed by an `accumulate()` override.
Refactor only — no behaviour change. Full test suite passes
(3368 / 3419, 51 pre-existing environment skips).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Fixes #922. LOKI's
i_of_qworkflow crashes at construction against the chopper-aware geometry artifact: the rear-bankdepends_onchain walks through/entry/instrument/detector_carriage/value, an empty NXlog placeholder, and essreduce'sreject_time_dependent_transformfilter fires because no live-stream override is installed.The narrow fix would have copied
DetectorViewFactory's existing per-sourcedynamic_transformswiring into_i_of_q_factory. We chose not to. The same pattern was already silently missing frombeam_monitor_m4(the unfinished tail of #780/#782) and would need to be re-applied to every future SANS or monitor variant whose chain walks the same node. The mechanism's failure mode — "if you forget the wiring, the workflow crashes only when CI runs against the new artifact" — does not scale. This PR replaces it with a per-instrument registry and a single helper, so the project's "no silent wrong values" property covers dynamic geometry uniformly.The fix is artifact-version-independent: the patched chain provider depends on a context key, so Sciline keeps it out of
compute(last_static)at construction time. Neither the old length-1 NXlog nor the new length-0 NXlog crashes there. Once the chopper-workflow branch lands, the existing roundtrip test exercises the formerly-failing path automatically.Design choices
Bindings keyed by NXlog path, not by source name. One NXlog can sit on the
depends_onchain of many components — the carriage node drives multiple positions in LOKI today, and more elsewhere tomorrow. Path-keying with an explicit consumer set mirrors thedepends_ongraph and avoids each component re-declaring the same routing.One author-declared
TransformLogsubclass per binding. The originalNewType('TransformValueLog', sc.DataArray | None)was a typing-spec misuse (pylint/mypy grumble) and lost the per-binding distinction Sciline needs when multiple dynamic transforms apply in one workflow. Short subclasses of a shared dataclass are grep-able, IDE-visible, type-safe, and give Sciline distinct keys.Apply patches per component type with a fused single-step provider. Sciline replaces providers by return type, so a patched provider for
NeXusTransformationChain[T, SampleRun]cleanly displaces essreduce's generic implementation for that instantiation; multiple component types compose. Fusing the prior two-stepTransformValueLog → TransformValue → chaininto one provider drops a parallel hierarchy whose intermediate had a single consumer per binding; splitting back is a binding-local refactor if ever needed.Aux-source merging centralised in
Instrument.register_spec. The old design threadeddynamic_transforms=…through every factory and required each callsite to remember it. Centralising the merge means all specs benefit, including those that callregister_specdirectly (e.g._i_of_q_factory). Scoping by the binding's consumer set keeps unrelated specs free of phantom aux inputs.Registry sits on
Instrument; helper is a method. Binding-matching is pure data on the instrument; only theworkflow.insert(...)step is Sciline-aware. Exposing it asInstrument.apply_dynamic_transformskeeps the registry encapsulated and matches the spirit ofregister_spec. Callers pass an explicitsource_name → component_typemapping, which doubles as documentation of which loaded components are subject to dynamic geometry.Trust CI, not artifact introspection at runtime. A registry-validator test (per instrument) walks each binding's
nxlog_pathfrom every declared consumer in the actual artifact and scans for orphan empty NXlogs. With that in place, the runtime helper does pure registry matching — no file I/O, no h5py walker on the construction hot path. The scippnexus-based chain walker lives only in the test.Instrumentbecomes a required argument ofDetectorViewFactoryandcreate_monitor_workflow. Every production callsite already had anInstrumentin scope; optionality only allowed misuse to compile. With the argument required,InstrumentDetectorSource— which just wrapped the instrument to pass it throughdata_source— became redundant and is removed;data_sourceis optional and defaults to instrument-derived geometry. Also tightens the wavelength-mode guard so the instrument-derived path is rejected explicitly rather than failing deep in essreduce. This is a small detour from the issue's scope, but the asymmetry would otherwise have remained as a confusing artifact of the rollout order.Loud failures everywhere. Forgetting to call the helper:
reject_time_dependent_transformraises at construction. Helper called but routing not wired:UnsatisfiedRequirementat construction. Both wired but no f144 message yet: "no samples yet" at firstfinalize. Registry inconsistent with artifact: CI test fails. Empty NXlog on a consumer's chain that no binding covers: CI test fails. No path silently substitutes a default.Deferred
beam_monitor_m4wavelength. Its position is represented by a separate empty NXlog (trans_20) rather than the carriage node. Resolving it needs either amake_geometry_nexus.pychange so m4 shares the carriage chain (a one-line addition to the binding's consumers) or registering an independent m4-position f144 stream. Tracked in the CI validator's known-orphan list; the test trips the moment the entry is removed.finalize. Converting that into an actionable error at job start is orthogonal to the correctness fix and worth its own work.Test plan
main.test_workflow_roundtrip[loki/i_of_q/1]exercises the formerly-failing path undertox.i_of_qin dev mode against the new artifact and confirm the rear-bank workflow produces I(Q) once adetector_carriagef144 message arrives.detector_carriagef144 producer mid-run and confirm the workflow surfaces the "no samples yet" error rather than producing stale results.