Unified per-instrument stream registry + LogContextBinding#936
Draft
SimonHeybrock wants to merge 7 commits into
Draft
Unified per-instrument stream registry + LogContextBinding#936SimonHeybrock wants to merge 7 commits into
SimonHeybrock wants to merge 7 commits into
Conversation
Adds `Stream` / `F144Stream` config records and an `Instrument.streams: dict[str, Stream]` field that replaces the parallel `f144_log_streams` dict + `f144_attribute_registry` field maintained per instrument. The StreamLUT for f144 is now derived from `instrument.streams`; the attribute lookup in detector/reduction/timeseries handlers reads `F144Stream.units` directly. Synthetic `chopper_cascade` handling moves to a `_SYNTHETIC_LOG_SOURCES` whitelist (no per-stream attrs needed beyond the empty-units default). Implements steps 1-5 of `docs/developer/plans/unified-stream-registry.md`. The codegen helper (`nexus_helpers.generate_f144_log_streams_code`) now emits `F144Stream` literals so generated output drops in directly.
Adds `LogContextBinding` (stream_name, workflow_key, dependent_sources)
and `Instrument.log_context_bindings: list[LogContextBinding]`. Bindings
are registered via `Instrument.add_log_context_binding(...)` so that
heavy Sciline-key imports (e.g. `InstrumentAngle[SampleRun]` from
`ess.spectroscopy`) stay out of per-instrument specs.py and are deferred
to `setup_factories`. `Instrument.__post_init__` validates that any
construction-time bindings reference declared streams;
`_validate_binding_dependent_sources` is called at the end of
`load_factories` to ensure each binding's `dependent_sources` matches
some registered spec's source_names.
Bifrost migrated as the demonstration consumer: the hand-coded
`context_keys={'detector_rotation': InstrumentAngle[SampleRun], ...}` in
`_make_cut_stream_processor` is now `instrument.get_context_keys(
'unified_detector')`, derived from bindings registered in
`setup_factories`.
LOKI's dynamic-transforms and any future chopper-workflow consumer can
now adopt the same binding list independently, without serialised
rollout.
Implements steps 6-7 of `docs/developer/plans/unified-stream-registry.md`
for the typed Sciline-key consumer path. The dynamic-transforms path
(LOKI's `TransformValueStream` machinery) is unchanged and will be a
focused follow-up.
Implements step 9 of `docs/developer/plans/unified-stream-registry.md` end-to-end as a proof point. Three pieces: 1. `build_streams(parsed, overrides, synthetics)` composes a final `dict[str, Stream]` from a parsed list, hand-edited overrides (keyed by either `stream_name` or `nexus_path`, applied via `dataclasses.replace`), and additional synthetic streams. Detects stream_name collisions in the parsed list and across synthetics. 2. `nexus_helpers.generate_streams_parsed_module` emits a complete importable Python module (SPDX header, docstring with source filename, `F144Stream` import, sorted list literal). The CLI now supports `--output PATH` to write directly. Multi-topic by default; `--topic` and `--exclude` still filter. 3. LOKI migrated as the demonstration consumer. `loki/streams_parsed.py` is checked-in auto-generated content; `loki/specs.py` imports `PARSED_STREAMS` and runs it through `build_streams`. The resulting `Instrument.streams` is functionally equivalent to before (the only addition is the parsed `nexus_path` field, which carries no behavioural change today but is the wire that LOKI's dynamic-transforms follow-up will tap into). A drift test (`tests/config/streams_parsed_test.py`) regenerates the LOKI module from the pinned geometry file and fails on diff, with a shell command in the failure message to refresh. Other instruments (bifrost, estia, dummy) keep hand-coded entries — their geometry files diverge from current spec naming in ways that need careful overrides, and the mechanism is now proven so those migrations can land on their own timelines.
Generates and checks in ``streams_parsed.py`` for bifrost, estia, loki, nmx, odin, and tbl, sourced from the corresponding ``coda_<inst>_*.hdf`` ground-truth files. Each instrument's ``specs.py`` now composes its registry via ``build_streams(PARSED_STREAMS, ...)``. Where stream names are referenced by other code (Bifrost factory bindings and aux_sources; Estia source_metadata), the parsed entries are renamed via overrides so those references remain intact: * Bifrost: ``detector_rotation`` (from ``detector_tank_angle_r0/value``) and ``sample_rotation`` (from ``114_sample_stack/rotation_stage/value``) * Estia: ``detector_rotation`` (from ``detector_arm/detector_rotation/value``) LOKI's ``streams_parsed.py`` is regenerated from the coda file too, since its 140-entry coda inventory is much fuller than the geometry-file fallback we used in the prior commit. Mechanism improvements landed in the same change: * ``suggest_internal_name`` now filters generic NeXus container groups (``entry``/``instrument``/``sample``/``sample_environment``/ ``transformations``) and joins parent+leaf so colliding leaves like ``phase`` or ``translation1`` carry parent context (e.g. ``005_PulseShapingChopper_phase``, ``wfm1_translation1``). * ``generate_streams_parsed_module`` raises with a per-conflict listing if any two distinct NXlog parents end up with the same suggested name — the user resolves via build_streams overrides. Drift test (``tests/config/streams_parsed_test.py``) is parameterised over instruments and regenerates from the local coda file when present (devcontainer); skipped otherwise. The LOKI-specific check moves to the shared mechanism. Stream/log mappings in ``streams.py`` for ``odin``, ``tbl``, and ``nmx`` are extended to expose the new f144 streams. Net effect: ``Instrument.streams`` now reflects what the file writer actually writes, with stream counts (bifrost 52, estia 38, loki 78, nmx 64, odin 156, tbl 68) replacing the previous hand-coded subset. Auto-registered timeseries workflows expand accordingly, making the dashboard surface the full set of available readbacks.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Replaces the hand-maintained
f144_log_streamsdict +f144_attribute_registryper-instrument structures with a singleInstrument.streams: dict[str, Stream], sourced from coda HDF5 files via a checked-instreams_parsed.py+build_streams(...)composition. AddsLogContextBindingfor routing f144 streams into Sciline workflow keys. Full design indocs/developer/plans/unified-stream-registry.md.What's in this PR
Stream/F144Streamrecords replace the parallel structures (specs.py dict,Instrument.f144_attribute_registry,StreamLUT[logs]). Topic/source/units/path live on one record; consumers derive views.LogContextBinding. Declares(stream_name, workflow_key, dependent_sources); replaces hand-codedcontext_keys={'detector_rotation': InstrumentAngle[SampleRun], ...}in factories. Bifrost migrated as the demonstration consumer. LOKI dynamic-transforms and chopper-workflow can adopt the same binding list independently.python -m ess.livedata.nexus_helpers <coda.hdf> --generate --output …emits an importablestreams_parsed.py(SPDX header +from ess.livedata.config import F144Stream+ sortedlist[F144Stream]).build_streams(parsed, overrides=…, synthetics=…)composes the final registry; overrides accept eitherstream_nameornexus_pathkeys.coda_<inst>_*.hdfground-truth file. Overrides preserve the few names referenced elsewhere (Bifrost factory bindings, aux_sources, source_metadata).suggest_internal_nameimprovements plus generator-side collision detection. Filters generic NeXus containers (entry/instrument/sample/sample_environment/transformations); joins parent+leaf sophase/translation1carry parent context. No collisions across 776 entries.Why now
Two motivations from the design doc: dynamic-transforms and chopper-workflow both need a stream→Sciline-key routing layer. Without unification each added structure (attribute registry, log_context_bindings shim, dynamic_transforms dict) drifts from the others. This PR collapses them so the follow-ups (LOKI dynamic-transforms migration; chopper-workflow adopting bindings) become focused diffs.
The expansion of registered streams (~25 → ~456 auto-registered timeseries workflows across instruments) is intended: manual setup was legacy and incomplete; the dashboard now surfaces the full readback set the file writer actually publishes.
Caveats
Test plan