Nested HDF5 Data / HEC-RAS #490
Does kerchunk.hdf not already cope with your "idiosyncratic" HDF5 files? A few recent changes were made to target nested trees in HDF, but I'd be interested to know in what manner it fails. Although there are some netCDF-influenced choices, JSON reference output should be fairly immune to those, I think. The main shortcoming in kerchunk for your files would be combining many of the reference sets into a logical aggregate dataset. It doesn't sound like you are doing that yet.
This is Python, so it doesn't really matter. I certainly didn't anticipate that anyone would want to call them from outside the class.
>>> import h5py
>>> import xarray as xr
>>> from kerchunk.hdf import SingleHdf5ToZarr
>>> ras_h5 = h5py.File("/mnt/c/temp/ElkMiddle.p01.hdf", "r")
>>> zmeta = SingleHdf5ToZarr(ras_h5, "file:///mnt/c/temp/ElkMiddle.p01.hdf").translate()
>>> import json
>>> with open("ElkMiddle.p01.hdf.json", "w") as z:
... z.write(json.dumps(zmeta, indent=2))
...
722087
>>> ds = xr.open_dataset("reference://", engine="zarr", backend_kwargs={"consolidated": False, "storage_options": {"fo": "ElkMiddle.p01.hdf.json"}})
>>> ds
<xarray.Dataset> Size: 0B
Dimensions: ()
Data variables:
*empty*
Attributes:
File Type: HEC-RAS Results
File Version: HEC-RAS 6.3.1 September 2022
Projection: PROJCS["USA_Contiguous_Albers_Equal_Area_Conic_USGS_versio...
Units System: US Customary
>>> ds = xr.open_dataset("reference://", engine="zarr", backend_kwargs={"consolidated": False, "storage_options": {"fo": "ElkMiddle.p01.hdf.json"}}, group="/Results/Unsteady/Output/Output Blocks/Base Output/Unsteady Time Series/2D Flow Areas/ElkMiddle")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/backends/api.py", line 588, in open_dataset
backend_ds = backend.open_dataset(
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/backends/zarr.py", line 1188, in open_dataset
ds = store_entrypoint.open_dataset(
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/backends/store.py", line 58, in open_dataset
ds = Dataset(vars, attrs=attrs)
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/core/dataset.py", line 713, in __init__
variables, coord_names, dims, indexes, _ = merge_data_and_coords(
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/core/dataset.py", line 427, in merge_data_and_coords
return merge_core(
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/core/merge.py", line 705, in merge_core
dims = calculate_dimensions(variables)
File "/home/thwllms/dev/scratch/test-kerchunk-ras/venv-test-kerchunk-ras/lib/python3.10/site-packages/xarray/core/variable.py", line 3009, in calculate_dimensions
raise ValueError(
ValueError: conflicting sizes for dimension 'phony_dim_1': length 33101 on 'Face Velocity' and length 14606 on {'phony_dim_0': 'Cell Cumulative Precipitation Depth', 'phony_dim_1': 'Cell Cumulative Precipitation Depth'}

The structure is complex. Any developer who has worked with RAS data could rant about it, but ultimately the point is that a helping hand is needed to extract data from RAS HDF5 files into nice xarray objects, hence the rashdf project. We considered using
Combining reference sets is not in the
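For context on the traceback above: the group can't be opened as one dataset because face-based and cell-based arrays share the same axis position but have different lengths, so the auto-generated phony dimension names collide. A minimal way to see this, reusing the file and group path from the failing call (a sketch, not part of the original discussion):

```python
import h5py

# Group path taken from the failing xr.open_dataset call above.
group = ("/Results/Unsteady/Output/Output Blocks/Base Output/"
         "Unsteady Time Series/2D Flow Areas/ElkMiddle")

with h5py.File("/mnt/c/temp/ElkMiddle.p01.hdf", "r") as f:
    for name, obj in f[group].items():
        if isinstance(obj, h5py.Dataset):
            # e.g. 'Face Velocity' -> (..., 33101) vs.
            # 'Cell Cumulative Precipitation Depth' -> (..., 14606):
            # same axis position, different lengths, hence the
            # conflicting 'phony_dim_1'.
            print(name, obj.shape)
```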
Seems like a reasonable assumption. Calling methods with leading underscores feels naughty to me, but if those methods are unlikely to change then maybe we're alright.
Sorry, didn't mean to close this.
That truly is a gnarly data hierarchy. I wonder whether the place kerchunk gets confused is groups which have both child group(s) and array(s). The zarr model does support that, but I wouldn't be too surprised if we have extra hidden assumptions. Writing specialised versions of the file scanner for a well-understood use case is of course fine, and I can try to help however I can.
It might be worthwhile figuring out what exactly is going wrong with pure kerchunk, but this doesn't touch most current users, as there is a lot of netCDF-compliant data, which is far simpler in structure. If you make interesting reference sets with your code, I'd be happy to see them and even better would be a blog post about your process :)
Using VirtualiZarr should make this part simpler to handle :)
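To make that concrete, here is a minimal sketch of the combine step with VirtualiZarr; the per-simulation filenames are hypothetical, and it assumes `open_virtual_dataset` can produce references for these HEC-RAS files (which, per the discussion above, may still require rashdf-specific scanning first):

```python
import xarray as xr
from virtualizarr import open_virtual_dataset

# Hypothetical per-simulation HEC-RAS output files.
paths = ["ElkMiddle.p01.hdf", "ElkMiddle.p02.hdf", "ElkMiddle.p03.hdf"]

# One "virtual" dataset per simulation: variables hold chunk references,
# not data. indexes={} avoids loading coordinate values eagerly.
virtual = [open_virtual_dataset(p, indexes={}) for p in paths]

# Concatenate along a new 'simulation' dimension, as described in the issue.
# coords="minimal"/compat="override" avoid comparing reference contents.
combined = xr.concat(virtual, dim="simulation", coords="minimal", compat="override")
```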
That all makes sense. FYI soon we should have
That's nasty 😆 But if you already have a (hacky) way of generating kerchunk references on a per-variable basis from an xarray dataset, it should be pretty straightforward to convert that into a virtualizarr "virtual" xarray dataset. You're basically just calling the internal function
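For reference, a rough sketch of that conversion using VirtualiZarr's manifest classes as documented at the time (ChunkManifest, ManifestArray, ZArray); the reference entries, shapes, dtype, and variable name below are made-up placeholders:

```python
import numpy as np
import xarray as xr
from virtualizarr.manifests import ChunkManifest, ManifestArray
from virtualizarr.zarr import ZArray

# Hypothetical chunk references for one variable, in kerchunk's
# {chunk_key: [url, offset, length]} form.
refs = {
    "0.0": ["file:///mnt/c/temp/ElkMiddle.p01.hdf", 40120, 55936],
    "1.0": ["file:///mnt/c/temp/ElkMiddle.p01.hdf", 96056, 55936],
}

manifest = ChunkManifest(
    entries={k: {"path": v[0], "offset": v[1], "length": v[2]} for k, v in refs.items()}
)

# Array-level metadata that would normally come from the .zarray entry.
zarray = ZArray(shape=(2, 14606), chunks=(1, 14606), dtype=np.dtype("float32"),
                compressor=None, filters=None, fill_value=None,
                order="C", zarr_format=2)

marr = ManifestArray(zarray=zarray, chunkmanifest=manifest)

# Wrap it in a plain xarray Dataset: this is the "virtual" dataset that can
# then be concatenated with others or written back out as references.
vds = xr.Dataset({"Cell Cumulative Precipitation Depth": (["time", "cell"], marr)})
```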
@TomNicholas, do you think the time has come to use Vzarr to do all that MultiZarrToZarr can do? Basically, the existing code in MZZ for finding the coordinates (the "mappers") for each input dataset is still useful, but then building the coordinates and outputting the references need not be duplicated. For cases where there is uneven chunking in the concat dimension(s), we can come up with the best way to represent that output, whichever manifest works. kerchunk could simply depend on VZarr.
IMO it's very close.
If you're willing to inline the indexed coordinates, then I think this is possible with VZ right now too, see zarr-developers/VirtualiZarr#18 (comment).
Currently we have the opposite: VirtualiZarr depends on kerchunk (for reading, and for writing to kerchunk json/parquet). I plan to make this dependency optional though - the codebase is already factored in such a way that this is effectively optional; it's only the tests that still mix concerns.
It really depends what you want to do with MZZ. In my head, MZZ is effectively already deprecated in favour of using vz and xarray to achieve the same combining operations. You could follow that and literally deprecate MZZ, leaving the kerchunk readers for use in other packages, especially to be called from
This is an orthogonal issue. Once uneven chunking is supported in the zarr spec, both generalizing vz's
We'd be figuring out the coordinates and putting them somewhere, wherever is most convenient. In many cases, it would be a single value per input data set. As I've said before, I'm indifferent to where the functionality ends up being implemented, so long as it exists! I think that being able to persist kerchunk references (or other manifests) is critical, though, so that a user needs only to open a thing without worrying about further options - I think everything is already in place for this to work.
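As a sketch of that persistence step, assuming the `combined` virtual dataset from the earlier VirtualiZarr example and an arbitrary output filename:

```python
import xarray as xr

# Persist the combined references so end users only need one open_dataset call.
combined.virtualize.to_kerchunk("elk_middle_combined.json", format="json")

# Anyone can then open the aggregate without knowing how it was built.
ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {"fo": "elk_middle_combined.json"},
    },
)
```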
I'm working on development of the rashdf library for reading HEC-RAS HDF5 data. A big part of the motivation for development of the library is stochastic hydrologic/hydraulic modeling.
We want to be able to generate Zarr metadata for stochastic HEC-RAS outputs, so that e.g. results for many different stochastic flood simulations from a given RAS model can be opened as a single xarray Dataset. For example, results for 100 different simulations could be concatenated in a new `simulation` dimension, with coordinates being the index number of each simulation.

It took me a little while to figure out how to make that happen because RAS HDF5 data is highly nested and doesn't conform to typical conventions. The way I implemented it is hacky (a rough sketch of the per-variable step is included after the questions below):

1. Start with an `xr.Dataset` pulled from the HDF file and the path of each child `xr.DataArray` within the HDF file.
2. For each child dataset, get `filters = SingleHdf5ToZarr._decode_filters(None, hdf_ds)` and `storage_info = SingleHdf5ToZarr._storage_info(None, hdf_ds)`.
3. Write the `xr.Dataset` to a `zarr.MemoryStore` with `compute=False`, to generate the framework of what's needed for the Zarr metadata.
4. Pull from the `zarr.MemoryStore` and decode, then combine the `zarr.MemoryStore` objects, `filters`, and `storage_info` into a dictionary and finally return.

I suppose my questions are:

- Could the `SingleHdf5ToZarr._decode_filters` and `_storage_info` methods be made public?
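To make the per-variable step above concrete, here is a rough, hypothetical sketch of building kerchunk-style references for a single HDF5 dataset with those private helpers. It is not rashdf's actual implementation, and the `.zarray` fields shown are simplified assumptions:

```python
import json

import h5py
from kerchunk.hdf import SingleHdf5ToZarr


def references_for_dataset(h5_path: str, dataset_path: str) -> dict:
    """Build a minimal kerchunk-style reference dict for one HDF5 dataset."""
    refs = {}
    with h5py.File(h5_path, "r") as f:
        dset = f[dataset_path]

        # Private kerchunk helpers: filters -> list of numcodecs codecs,
        # storage_info -> {chunk_index_tuple: {"offset": ..., "size": ...}}.
        filters = SingleHdf5ToZarr._decode_filters(None, dset)
        storage_info = SingleHdf5ToZarr._storage_info(None, dset)

        key = dataset_path.strip("/")

        # Simplified .zarray metadata; a real implementation also needs
        # fill values, attribute handling, and compressor/filter ordering.
        refs[f"{key}/.zarray"] = json.dumps(
            {
                "shape": list(dset.shape),
                "chunks": list(dset.chunks or dset.shape),
                "dtype": dset.dtype.str,
                "compressor": None,
                "filters": [flt.get_config() for flt in (filters or [])] or None,
                "fill_value": None,
                "order": "C",
                "zarr_format": 2,
            }
        )

        # One [url, offset, length] entry per stored chunk.
        for chunk_index, info in storage_info.items():
            chunk_key = f"{key}/" + ".".join(str(i) for i in chunk_index)
            refs[chunk_key] = [f"file://{h5_path}", info["offset"], info["size"]]

    return refs
```

A per-variable dict like this could then be merged with the group and attribute metadata produced by the `zarr.MemoryStore` step to form the final reference set.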