SSEP‐3 ‐ Data Containers
SSEP | 3 |
---|---|
Title | Data Containers |
Author(s) | First Last |
Contact Email | [email protected] |
Date-created | YYYY-MM-DD |
Date-updated | YYYY-MM-DD |
Type | Standard |
Discussion | link to discussion if available |
Status | < Discussion | Accepted | Rejected > |
Spectral X-ray data used in solar and heliospheric science come in a wide variety of shapes, formats, and dimensions, depending on the instrument and observational mode. A crucial part of the Sunkit-SPEX architecture is the implementation of standardized, extensible data containers that provide a consistent API for working with spectral data and associated metadata.
As heliophysics moves beyond Earth-centric observations, assumptions about observer location, instrument distance from the source, and projection geometry must be explicitly encoded in data structures. Instruments now produce event lists, 1D spectra (e.g., energy), 2D (e.g., time-energy), 3D (e.g., space-space-energy), and even 4D datasets (space-space-energy-time). Future missions may add additional dimensions such as polarisation.
Moreover, many analyses require transforming or collapsing higher-dimensional spectra into lower dimensions (e.g., summing over time in a 2D spectrum to obtain a time-integrated 1D spectrum), and this must be reflected consistently across both the data and the instrument response.
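As a rough illustration of such a collapse, the sketch below sums a small time-energy array over its time axis using plain NumPy. The array values, the exposure times, and the variable names are all made up for the example; a real container would carry units, WCS, and metadata alongside the numbers.

```python
import numpy as np

# Hypothetical 2D time-energy spectrum in counts: 4 time bins x 3 energy bins.
counts = np.array([
    [10, 20, 5],
    [12, 18, 6],
    [9, 22, 4],
    [11, 19, 7],
])

# Assumed exposure of each time bin, in seconds.
exposure = np.array([2.0, 2.0, 1.5, 2.5])

# Collapse the time axis (axis 0): counts add, and so do the exposures.
counts_1d = counts.sum(axis=0)
total_exposure = exposure.sum()

# The time-integrated count rate uses the summed exposure,
# rather than an average of the per-bin rates.
rate_1d = counts_1d / total_exposure
```

The point of the sketch is that the exposure has to be collapsed together with the data, which is one reason the metadata needs to be sliceable along the same axes.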
Note
This SSEP addresses spectral data containers. Event lists will be addressed in a separate proposal.
In addition to the spectral measurements, any scientific analysis must also incorporate the spectral response of the instrument. These response functions often come in separate files and diverse formats, and their structure may not directly mirror the data itself. However, any transformation applied to the spectral data (e.g., binning or slicing) must also be applied coherently to the response. For instance, summing adjacent energy bins requires rebinning the corresponding columns in the response matrix—often a non-trivial operation.
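The rebinning example can be sketched with NumPy as follows. This assumes a response matrix stored with photon-energy bins along the rows and count channels along the columns, in units of counts per incident photon (not per keV), so that adjacent columns can simply be summed. The helper `rebin_last_axis` and all array values are hypothetical.

```python
import numpy as np


def rebin_last_axis(array, factor):
    """Sum groups of `factor` adjacent bins along the last axis."""
    if array.shape[-1] % factor != 0:
        raise ValueError("number of bins must be divisible by the rebin factor")
    return array.reshape(*array.shape[:-1], -1, factor).sum(axis=-1)


rng = np.random.default_rng(0)
n_photon, n_count = 5, 6

# Assumed layout: rows are photon-energy bins, columns are count channels.
srm = rng.random((n_photon, n_count))
photon_flux = rng.random(n_photon)

# Forward-fold the photon spectrum through the response.
counts = photon_flux @ srm

# Apply the same rebinning to the data and to the response columns.
counts_rebinned = rebin_last_axis(counts, 2)
srm_rebinned = rebin_last_axis(srm, 2)

# Folding through the rebinned response reproduces the rebinned data.
consistent = np.allclose(photon_flux @ srm_rebinned, counts_rebinned)
```

If the response were stored per unit energy instead, the columns would need a width-weighted average rather than a plain sum, which is part of why the operation is non-trivial in general.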
A core requirement is that the spectral data and the instrument response share a compatible and coordinated API to ensure joint operations are applied consistently. Both objects must also preserve and expose relevant metadata, including coordinate frames, exposure information, and instrumental context.
This SSEP outlines the high-level design and required features for spectral data containers in Sunkit-SPEX.
Sunkit-SPEX can draw inspiration and leverage design patterns from several existing Python packages:
sunpy provides two data containers that can be drawn from, namely Map and TimeSeries. Because instrument positions can no longer be assumed to be fixed and roughly Earth-located, the Map coordinate frame and observer location properties offer a possible approach to incorporating this information.
specutils provides Spectrum1D, SpectrumCollection and SpectrumList, which are designed to hold exactly the kind of spectral data needed, but from a slightly different domain. The differences and some of the potential issues are discussed in this XraySpectrum1D class with loaders (now admittedly very old) PR. Essentially, the difficulties lie in mapping operations and information from the spectra to the response.
NDCube provides a number of high-level objects for representing multidimensional data, with a particular focus on WCS and GWCS, namely NDCube, NDCubeSequence and NDCollection. The similarity between the specutils containers and the NDCube containers is no accident: specutils is based on the NDCube objects. Furthermore, sunpy Map is in the process of being refactored to eventually be an NDCube. Recent progress on sliceable metadata makes NDCube
We propose the definition of at least two primary container classes:
- Spectrum
  - A flexible, multi-dimensional container for spectral measurements.
  - Based on or inspired by NDCube.
  - Must support slicing, indexing, and collapsing operations (e.g., summing over time).
  - Encapsulates metadata such as observation time, instrument position, and exposure.
  - Should support transformations like rebinning, stacking, or unit conversion.
- SpectrometerResponse
  - A container that mirrors the Spectrum API and represents the instrument’s response.
  - Supports consistent operations with the Spectrum container (e.g., rebinning, masking).
  - Supports different response representations (e.g., RMF+ARF or SRM) when applicable.
  - May support chained responses (e.g., detector + optics).
  - Should support lazy loading and chunking for large file sizes.
  - Should support modelling, i.e., parameters of the response can be fit from data.
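To make the RMF+ARF versus SRM distinction and the idea of chained responses concrete, here is a minimal NumPy sketch. It assumes the common convention in which the RMF holds the probability of a photon in a given energy bin being recorded in each count channel, and the ARF holds the effective area per photon-energy bin; the numbers and the optics transmission term are invented for illustration.

```python
import numpy as np

# Hypothetical ARF: effective area (cm^2) for each of 3 photon-energy bins.
arf = np.array([10.0, 20.0, 30.0])

# Hypothetical RMF: rows are photon-energy bins, columns are count channels.
# Each row is a redistribution probability and sums to 1.
rmf = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.2, 0.7, 0.1],
])

# Combined spectral response matrix: scale each RMF row by the effective area.
srm = arf[:, np.newaxis] * rmf

# A chained response, here an assumed energy-dependent optics transmission
# applied in front of the detector.
transmission = np.array([0.5, 0.8, 1.0])
srm_chained = transmission[:, np.newaxis] * srm
```

Keeping the components separate until they are needed is one way a response container could support both representations, and it leaves room for fitting individual components such as the transmission.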
Unified API Principles
- Identical indexing/slicing semantics for data and response.
- Coordinate-aware metadata and observer location tracking.
- Compatible with Astropy Units and Quantities, WCS and GWCS.
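One possible reading of the first principle is sketched below with a toy class that is not part of any existing package. `ToyCoupledSpectrum` and `select_channels` are hypothetical names, and the only assumption is that the data and the response share the count-channel axis as their last axis.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyCoupledSpectrum:
    """Hypothetical pairing of counts and response sharing the channel axis."""

    counts: np.ndarray    # shape (..., n_channel)
    response: np.ndarray  # shape (n_photon, n_channel)

    def select_channels(self, channels: slice) -> "ToyCoupledSpectrum":
        # The same slice is applied to the last axis of both arrays,
        # so the data and the response cannot drift out of step.
        return ToyCoupledSpectrum(
            self.counts[..., channels],
            self.response[..., channels],
        )


counts = np.arange(8).reshape(2, 4)  # 2 time bins x 4 count channels
response = np.eye(3, 4)              # 3 photon-energy bins x 4 count channels

pair = ToyCoupledSpectrum(counts, response)
sub = pair.select_channels(slice(1, 3))
```

Whether the coupling lives in a wrapper like this or in matching `__getitem__` behaviour on two separate classes is a design decision this sketch does not try to settle.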
Following NDCube we can define metadata as
from ndcube import NDCube


class Spectrum(NDCube):
    r"""
    Spectrum container for data with one spectral axis.

    Note that "1D" in this case refers to the fact that there is only one
    spectral axis. `Spectrum` can contain "vector 1D spectra" by having the
    ``flux`` have a shape with dimension greater than 1.

    Notes
    -----
    A stripped down version of `Spectrum1D` from `specutils`.

    Parameters
    ----------
    data : `~astropy.units.Quantity`
        The data for this spectrum. This can be a simple `~astropy.units.Quantity`,
        or an existing `~Spectrum1D` or `~ndcube.NDCube` object.
    uncertainty : `~astropy.nddata.NDUncertainty`
        Contains uncertainty information along with propagation rules for
        spectrum arithmetic. Can take a unit, but if none is given, will use
        the unit defined in the flux.
    spectral_axis : `~astropy.units.Quantity` or `~specutils.SpectralAxis`
        Dispersion information with the same shape as the dimension specified by
        ``spectral_dimension``, or that shape plus one if specifying bin edges.
    spectral_dimension : `int`, default 0
        The dimension of the data which represents the spectral information.
        Defaults to the first dimension (index 0).
    mask : `~numpy.ndarray`-like
        Array where values in the flux to be masked are those that
        ``astype(bool)`` converts to True. (For example, integer arrays are not
        masked where they are 0, and masked for any other value.)
    meta : dict
        Arbitrary container for any user-specific information to be carried
        around with the spectrum container object.
    ------------------------------------------
    exposure_time : `~astropy.units.Quantity`
        The effective or actual exposure time used to normalise the data.
    area : `~astropy.units.Quantity`
    """
class Response(NDCube):
    r"""
    Response container for spectrometer response.

    Parameters
    ----------
    redistribution_matrix
    transmission
    effective_area
    """
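Two conventions from the `Spectrum` docstring, a spectral axis given as bin edges (one element longer than the data) and a mask interpreted through ``astype(bool)``, can be illustrated with plain NumPy. The values below are invented, and a real implementation would presumably use Astropy quantities rather than bare arrays.

```python
import numpy as np

# Hypothetical flux in 4 energy bins, with the spectral axis given as bin edges.
flux = np.array([5.0, 8.0, 3.0, 1.0])
edges = np.array([4.0, 6.0, 8.0, 10.0, 12.0])  # keV, n + 1 edges for n bins

# Bin edges have the spectral dimension's length plus one.
has_edge_shape = edges.shape[0] == flux.shape[0] + 1

# Derive bin centres and widths from the edges.
centres = 0.5 * (edges[:-1] + edges[1:])
widths = np.diff(edges)

# Integer mask: 0 means "not masked", any other value means "masked".
mask = np.array([0, 2, 0, 1])
is_masked = mask.astype(bool)
valid_flux = flux[~is_masked]
```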