Skip to content

Centralise dynamic NeXus transforms for f144-driven geometry#923

Draft
SimonHeybrock wants to merge 17 commits into
mainfrom
worktree-issue-922-loki-dynamic-transform
Draft

Centralise dynamic NeXus transforms for f144-driven geometry#923
SimonHeybrock wants to merge 17 commits into
mainfrom
worktree-issue-922-loki-dynamic-transform

Conversation

@SimonHeybrock
Copy link
Copy Markdown
Member

@SimonHeybrock SimonHeybrock commented May 8, 2026

Why

Fixes #922. LOKI's i_of_q workflow crashes at construction against the chopper-aware geometry artifact: the rear-bank depends_on chain walks through /entry/instrument/detector_carriage/value, an empty NXlog placeholder, and essreduce's reject_time_dependent_transform filter fires because no live-stream override is installed.

The narrow fix would have copied DetectorViewFactory's existing per-source dynamic_transforms wiring into _i_of_q_factory. We chose not to. The same pattern was already silently missing from beam_monitor_m4 (the unfinished tail of #780/#782) and would need to be re-applied to every future SANS or monitor variant whose chain walks the same node. The mechanism's failure mode — "if you forget the wiring, the workflow crashes only when CI runs against the new artifact" — does not scale. This PR replaces it with a per-instrument registry and a single helper, so the project's "no silent wrong values" property covers dynamic geometry uniformly.

The fix is artifact-version-independent: the patched chain provider depends on a context key, so Sciline keeps it out of compute(last_static) at construction time. Neither the old length-1 NXlog nor the new length-0 NXlog crashes there. Once the chopper-workflow branch lands, the existing roundtrip test exercises the formerly-failing path automatically.

Design choices

Bindings keyed by NXlog path, not by source name. One NXlog can sit on the depends_on chain of many components — the carriage node drives multiple positions in LOKI today, and more elsewhere tomorrow. Path-keying with an explicit consumer set mirrors the depends_on graph and avoids each component re-declaring the same routing.

One author-declared TransformLog subclass per binding. The original NewType('TransformValueLog', sc.DataArray | None) was a typing-spec misuse (pylint/mypy grumble) and lost the per-binding distinction Sciline needs when multiple dynamic transforms apply in one workflow. Short subclasses of a shared dataclass are grep-able, IDE-visible, type-safe, and give Sciline distinct keys.

Apply patches per component type with a fused single-step provider. Sciline replaces providers by return type, so a patched provider for NeXusTransformationChain[T, SampleRun] cleanly displaces essreduce's generic implementation for that instantiation; multiple component types compose. Fusing the prior two-step TransformValueLog → TransformValue → chain into one provider drops a parallel hierarchy whose intermediate had a single consumer per binding; splitting back is a binding-local refactor if ever needed.

Aux-source merging centralised in Instrument.register_spec. The old design threaded dynamic_transforms=… through every factory and required each callsite to remember it. Centralising the merge means all specs benefit, including those that call register_spec directly (e.g. _i_of_q_factory). Scoping by the binding's consumer set keeps unrelated specs free of phantom aux inputs.

Registry sits on Instrument; helper is a method. Binding-matching is pure data on the instrument; only the workflow.insert(...) step is Sciline-aware. Exposing it as Instrument.apply_dynamic_transforms keeps the registry encapsulated and matches the spirit of register_spec. Callers pass an explicit source_name → component_type mapping, which doubles as documentation of which loaded components are subject to dynamic geometry.

Trust CI, not artifact introspection at runtime. A registry-validator test (per instrument) walks each binding's nxlog_path from every declared consumer in the actual artifact and scans for orphan empty NXlogs. With that in place, the runtime helper does pure registry matching — no file I/O, no h5py walker on the construction hot path. The scippnexus-based chain walker lives only in the test.

Instrument becomes a required argument of DetectorViewFactory and create_monitor_workflow. Every production callsite already had an Instrument in scope; optionality only allowed misuse to compile. With the argument required, InstrumentDetectorSource — which just wrapped the instrument to pass it through data_source — became redundant and is removed; data_source is optional and defaults to instrument-derived geometry. Also tightens the wavelength-mode guard so the instrument-derived path is rejected explicitly rather than failing deep in essreduce. This is a small detour from the issue's scope, but the asymmetry would otherwise have remained as a confusing artifact of the rollout order.

Loud failures everywhere. Forgetting to call the helper: reject_time_dependent_transform raises at construction. Helper called but routing not wired: UnsatisfiedRequirement at construction. Both wired but no f144 message yet: "no samples yet" at first finalize. Registry inconsistent with artifact: CI test fails. Empty NXlog on a consumer's chain that no binding covers: CI test fails. No path silently substitutes a default.

Deferred

  • beam_monitor_m4 wavelength. Its position is represented by a separate empty NXlog (trans_20) rather than the carriage node. Resolving it needs either a make_geometry_nexus.py change so m4 shares the carriage chain (a one-line addition to the binding's consumers) or registering an independent m4-position f144 stream. Tracked in the CI validator's known-orphan list; the test trips the moment the entry is removed.
  • Readiness gate at job start. The closure already raises a clear "no samples yet" error at first finalize. Converting that into an actionable error at job start is orthogonal to the correctness fix and worth its own work.

Test plan

  • CI passes against main.
  • After the chopper-workflow branch lands and registers the new geometry artifact, confirm test_workflow_roundtrip[loki/i_of_q/1] exercises the formerly-failing path under tox.
  • Run LOKI i_of_q in dev mode against the new artifact and confirm the rear-bank workflow produces I(Q) once a detector_carriage f144 message arrives.
  • Drop the detector_carriage f144 producer mid-run and confirm the workflow surfaces the "no samples yet" error rather than producing stale results.

SimonHeybrock and others added 15 commits May 8, 2026 12:12
Replace the per-source `dynamic_transforms` wiring in `DetectorViewFactory`
with a per-instrument registry on `Instrument.dynamic_transforms` that the
new `apply_dynamic_transforms` helper applies to any factory. The helper
walks the depends_on chain in the geometry artifact, intersects against
the registry, and inserts a per-component-type closure that drives matching
NXlog placeholders from f144 streams.

Pre-emptive fix for the LOKI i_of_q workflow which crashes at construction
time against the chopper-aware geometry artifact (issue #922): its
rear-bank depends_on chain walks through `detector_carriage/value`, an
empty NXlog that essreduce's `reject_time_dependent_transform` rejects
because no filter is installed.

Key shifts versus the prior single-transform mechanism:

- Bindings are keyed by `nxlog_path` and carry their own `consumers` set,
  so one binding can cover multiple components sharing a chain. Replaces
  `LOKI_DYNAMIC_TRANSFORMS` (dict keyed by source).
- Each binding's Sciline key is an author-declared `TransformLog` subclass
  (grep-able, type-clean), replacing the awkward
  `NewType('TransformValueLog', sc.DataArray | None)`. SPW gains a tiny
  rule that wraps raw NXlog payloads into the typed subclass before
  `set_context`.
- `Instrument.register_spec` automatically merges the matching
  dynamic-transform aux sources into every spec (scoped per spec source
  list), so factories declare aux only for their own roles.
- A patched `NeXusTransformationChain[T, SampleRun]` provider per loaded
  component type fuses the previous two-step
  `TransformValueLog -> TransformValue -> chain` into one closure with
  concrete annotations.

The single-transform machinery is deleted: `TransformValueStream`,
`TransformValueLog`, `TransformValue`, `TransformName`,
`get_transformation_chain_with_value`, `transform_value_from_log`,
`add_dynamic_transform`, `DetectorROIAuxSources(dynamic_transforms=...)`,
the `DetectorViewFactory(dynamic_transforms=...)` ctor argument, and
`tests/handlers/detector_view/transform_value_test.py`.

A CI registry validator parameterised per instrument confirms each
binding's `nxlog_path` is reachable from every declared consumer in the
artifact and that no orphan empty NXlogs slip through. `beam_monitor_m4`
position remains a known follow-up (its chain still walks an unmanaged
empty NXlog `trans_20`); resolving it needs either a
`make_geometry_nexus.py` change to share the carriage NXlog or a separate
m4 position f144 stream registration. Tracked in
`_KNOWN_ORPHAN_NXLOGS`.

Design captured in `docs/developer/plans/dynamic-nexus-transforms.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apply_dynamic_transforms was duplicating scippnexus's depends_on
walker through ~75 LOC of h5py helpers (_decode, _walk_depends_on,
_component_chain_paths, _scan_for_empty_nxlog), with an unused
_NX_CLASS_BY_TYPE dict and a _component_group_path whose
component_type parameter was never read.

Replace all of it with a single load_depends_on_chain helper that
delegates to scippnexus.parse_depends_on_chain, picking up correct
handling of relative targets, terminal '.', and cycles. The empty-
NXlog check now inspects the loaded chain's transformations directly.

Going through ess.reduce's NeXusTransformationChain provider was
rejected because it requires a full NeXusComponent load first — heavy
and warns on partial artifacts (e.g. detectors without a signal
field), which would surface at workflow-construction time.
Per-source scoping is statically known from the registry — `source_name in
binding.consumers`. Inspecting the artifact at workflow-construction time
duplicated what CI (`tests/config/dynamic_transforms_registry_test.py`)
already verifies: that the registry's `nxlog_path`/`consumers` match the
geometry artifact. Trust CI as the source of truth and use registry-only
matching at runtime; no file I/O on the runtime path.

Moves the scippnexus-based chain walker into the CI test (its sole user),
and drops the now-redundant runtime unmanaged-NXlog test (covered by the
CI orphan scan).
The closure replaces essreduce's ``get_transformation_chain``, so it must
share the same input (NeXusComponent) and output (NeXusTransformationChain)
types — depending on the chain instead would self-cycle on its own return
type. Spell this out in the docstring.
The dynamic-transform aux-source plumbing is cross-cutting, not
detector-view-specific. Move it to its natural home so
`Instrument.register_spec` no longer has to reach upward into
`detector_view_specs`:

- `CombinedAuxSources` -> `config/workflow_spec.py` (next to
  `AuxSources`; pure composition primitive).
- `_DynamicTransformAuxSources` + `compose_aux_sources` ->
  `handlers/dynamic_transforms.py`. Fold the former
  `dynamic_transform_aux_inputs` / `dynamic_transform_routes`
  helpers into the class -- they were only used by it.

`detector_view_specs.py` retains `DetectorROIAuxSources`, which
genuinely is detector-specific.
`apply_dynamic_transforms` previously took `component_types=(NXdetector,)`
and resolved each component's source name via
`workflow.compute(NeXusName[T])`. That single-key lookup cannot
distinguish per-channel monitors (`NeXusMonitorName[Incident]` vs
`NeXusMonitorName[Transmission]`, ...), which matters because some
monitor sources do move and may grow dynamic-transform bindings.

Take `dynamic_keys + aux_source_names` instead: iterate `NeXusData[T, R]`
entries (the natural per-channel granularity) and resolve each alias
through `aux_source_names`. Plain dict lookup, no graph round-trip,
no precondition that the workflow already has `NeXusName[T]` set.

Call sites (LOKI `i_of_q`, detector-view factory) pass through the
existing `aux_source_names` parameter; explicit `wf[NeXusDetectorName] =
source_name` / `wf[NeXusMonitorName[T]] = ...` lines stay where they
are.
The binding-matching logic is pure data on `Instrument`; only the
chain-provider insertion is sciline-aware (and that's a single
`workflow.insert(...)` line). Expose it as a method so factories
don't import the function and the registry stays encapsulated, in
the same spirit as `register_spec` auto-applying `compose_aux_sources`.

The method takes a resolved `source_name -> component_type` mapping
rather than `dynamic_keys + aux_source_names`. Alias resolution is a
factory/workflow-construction concern; `Instrument` shouldn't know
about it. Callers enumerate the components explicitly, which doubles
as documentation of which loaded components are subject to dynamic
geometry.

`_build_patched_chain_provider` stays in `handlers/dynamic_transforms.py`
and is imported lazily by the method to avoid the cycle.
The two checks asserted dataclass-default and that `class A(B): pass`
produces a distinct type — testing Python, not our code. The subclasses
themselves are kept as fixtures for the remaining tests.
Extract the ``exec`` from ``_build_patched_chain_provider`` into a
``_synthesise_provider`` helper whose docstring spells out why code
synthesis is unavoidable (Sciline introspects via
``inspect.getfullargspec`` which reads ``__code__`` and ignores
``__signature__``; ``__annotations__`` alone cannot invent parameters)
and points at the stdlib precedents that use the same technique.

``_build_patched_chain_provider`` now reads as plain Python: define
``_impl``, build the annotations dict, hand off. ``_impl`` takes
``*containers`` so the synthesised wrapper can call it directly without
a tuple-packing dance in the template.
Mirror DetectorViewFactory's pattern: an optional ``instrument`` kwarg
lets create_monitor_workflow apply the instrument's dynamic-transform
bindings to the internal sciline pipeline (typed as
``NeXusTransformationChain[NXmonitor, SampleRun]``) and merge the
resulting Sciline context keys with any caller-supplied ones. With no
matching binding the call is a no-op, so wiring this in at factory
sites is safe ahead of any monitor source being added to a binding.
…tories

Every production call site has an Instrument in scope, so optionality
just allowed misuse to compile. Wire ``instrument=instrument`` through
all instrument factories (bifrost, dream, dummy, estia, loki, nmx, odin,
tbl) and the ``create_monitor_workflow_factory`` wrapper. With no monitor
in any binding's ``dependent_sources`` this is a no-op today; lights up
automatically when a binding for a movable monitor is registered.
Mirrors the same change for ``create_monitor_workflow``. Every production
``DetectorViewFactory(...)`` site has an Instrument in scope and most
already construct ``InstrumentDetectorSource(instrument)`` right next to
the factory; the optionality was an accident of the parent PR's roll-out
order, not a design choice. Drop the ``if self._instrument is not None``
guard inside ``make_workflow`` and wire ``instrument=`` at the dummy,
NMX, DREAM and ``Instrument.load_factories`` call sites plus three test
helpers.
Now that ``DetectorViewFactory`` always receives an ``Instrument``,
wrapping the same instrument in ``InstrumentDetectorSource`` to pass
again via ``data_source`` is redundant. Make ``data_source`` optional;
when omitted, build ``EmptyDetector`` from
``instrument.get_detector_number(source_name)`` — what the wrapper did
anyway. ``NeXusDetectorSource`` stays as the opt-in for file-loaded
geometry (still required for wavelength mode); ``DetectorNumberSource``
remains for tests that want a file-less, instrument-less setup.

Also tightens the wavelength-mode guard: previously only
``DetectorNumberSource`` was rejected; now any data_source that is not
``NeXusDetectorSource`` is, closing a latent gap where the
instrument-derived path (no positions) would have silently failed
deeper in essreduce.
@SimonHeybrock SimonHeybrock force-pushed the worktree-issue-922-loki-dynamic-transform branch from d3595c5 to e279827 Compare May 12, 2026 07:13
The mechanism added on this branch — typed Sciline-key wrappers around
cumulative `ToNXlog` payloads, plus a `StreamProcessorWorkflow` rule that
wraps raw NXlogs in the typed key before `set_context` — is generic, but
the symbols were named and placed as if exclusively for dynamic NeXus
transformations. Comparing this branch to `chopper-workflow` made the
overlap concrete: the chopper workflow re-implements the same
"latest-sample-of-context-NXlog → Sciline parameter" pattern by hand
(an `accumulate()` override on the workflow class, scalar caches, a
closure provider), and the per-spec aux-source routing is reimplemented
as a one-off helper. Both branches duplicate the routing layer and the
typed-key convention; only the chain-patcher itself is transform-specific.

This commit factors the convention into a small, generic layer that
either branch can consume:

- `ValueLog` (frozen dataclass with `values: sc.DataArray | None`) moves
  to `stream_processor_workflow.py`, alongside the wrapping rule that
  consumes it. The name drops the transform-flavoured prefix — the class
  is the typed wrapper for an NXlog's `value`-over-`time` payload, no
  more transform-specific than `ToNXlog` is.

- `synthesise_provider` (was `_synthesise_provider` in
  `dynamic_transforms.py`) moves to `stream_processor_workflow.py` as
  public API. It is pure code-synthesis: build a Sciline provider with
  N named typed positional parameters. The codegen path is unavoidable
  because Sciline introspects providers via `inspect.getfullargspec()`,
  reading `__code__` rather than `__signature__`; producing typed
  positional params therefore requires a real function via
  `exec`/`compile`. Same technique `dataclasses`, `attrs`, `pydantic`,
  and `collections.namedtuple` use.

- A new `handlers/log_context.py` module hosts the generic routing
  layer:
    - `LogContextBinding` — declares "f144 stream X feeds Sciline key K,
      scoped to sources S, with units U". Carries optional `units` so
      `Instrument` can auto-populate `f144_attribute_registry` from
      bindings, removing the manual splice currently required.
    - `LogContextAuxSources` — walks `instrument.log_context_bindings`
      to derive a spec's aux-source set; renders per-job by
      `source_name`. Identical behaviour to the previous
      `_DynamicTransformAuxSources`, generalised over binding kind.
    - `compose_aux_sources` — same.

- `dynamic_transforms.py` shrinks to the transform-specific layer:
    - `DynamicTransformBinding(LogContextBinding)` adds `nxlog_path`.
    - `build_patched_chain_provider` (renamed from the private helper)
      uses imported `synthesise_provider` and `ValueLog`.

- `Instrument.dynamic_transforms` is renamed to `log_context_bindings`
  and re-typed to the generic base. `apply_dynamic_transforms` filters
  via `isinstance(b, DynamicTransformBinding)` so non-transform
  bindings (future chopper bindings, anything else f144-driven that
  needs context delivery without provider patching) coexist on the
  same list. `__post_init__` auto-derives `f144_attribute_registry`
  entries from bindings that declare units; explicit entries still win
  on conflict, so existing instrument configs are unchanged.

The LOKI carriage binding picks up `units='mm'` to demonstrate the
auto-derivation. No behavioural change — the explicit
`f144_log_streams`-derived registry continues to populate the same
entry; the binding's units now make the explicit one redundant rather
than load-bearing.

What this enables on `chopper-workflow` after the rebase: drop the
legacy single-binding transform machinery in
`detector_view/workflow.py` (subsumed here), drop
`_chopper_aux_sources` and `make_chopper_attribute_registry` (the
binding list replaces both), and reshape `_WavelengthLutWorkflow` to
consume per-chopper logs via a `synthesise_provider`-built provider
rather than a Python-side cache fed by an `accumulate()` override.

Refactor only — no behaviour change. Full test suite passes
(3368 / 3419, 51 pre-existing environment skips).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LOKI i_of_q workflow fails to start with new geometry artifact (length-0 NXlogs)

1 participant