Implement datatree_to_icechunk such that the previously added test passes #91

Open
maxrjones opened this issue Feb 24, 2025 · 3 comments

@maxrjones
Contributor

No description provided.

@maxrjones
Contributor Author

The new functionality for this would go in https://github.com/zarr-developers/VirtualiZarr/blob/main/virtualizarr/writers/icechunk.py. I think it may be possible to implement this before the xarray DataTree <-> Zarr Python 3 issues are resolved if datatree_to_icechunk iterates over the nodes and calls write_variables_to_icechunk_group on each node.
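
A rough sketch of that per-node loop, for discussion only. It assumes the tree exposes `.subtree`, `.path`, and `.to_dataset()` the way `xarray.DataTree` does. `write_group` is a hypothetical callback standing in for `write_variables_to_icechunk_group`, whose real signature is not reproduced here, and `StubNode` is a made-up stand-in so the snippet runs without xarray installed.

```python
from typing import Any, Callable


def write_tree_by_group(tree: Any, write_group: Callable[[Any, str], None]) -> list[str]:
    """Call write_group(dataset, group_path) once per node and return the paths used."""
    written = []
    for node in tree.subtree:  # on a DataTree this yields the root and every descendant
        group_path = node.path  # absolute path within the tree, e.g. "/" or "/a/b"
        write_group(node.to_dataset(), group_path)
        written.append(group_path)
    return written


class StubNode:
    """Hypothetical stand-in exposing only the attributes the loop above needs."""

    def __init__(self, path: str, data: dict, children: tuple = ()):
        self.path = path
        self.data = data
        self.children = children

    @property
    def subtree(self):
        # Yield this node first, then recurse into the children.
        yield self
        for child in self.children:
            yield from child.subtree

    def to_dataset(self) -> dict:
        # Return a copy so the writer cannot mutate the node's own data.
        return dict(self.data)


calls = []  # records (dataset, group_path) pairs instead of writing to a store
root = StubNode("/", {"t": 1}, (StubNode("/a", {"u": 2}, (StubNode("/a/b", {"v": 3}),)),))
written = write_tree_by_group(root, lambda ds, path: calls.append((ds, path)))
print(written)
```

With a real `DataTree`, `to_dataset()` also accepts an `inherit` flag, so whether coordinates inherited from parent nodes get written into each group would still need to be decided.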

@chuckwondo

> The new functionality for this would go in https://github.com/zarr-developers/VirtualiZarr/blob/main/virtualizarr/writers/icechunk.py. I think it may be possible to implement this before the xarray DataTree <-> Zarr Python 3 issues are resolved if datatree_to_icechunk iterates over the nodes and calls write_variables_to_icechunk_group on each node.

@maxrjones, when you say "nodes," do you mean datasets? If so, does that mean that each time we call write_variables_to_icechunk_group, the group is the absolute path of the dataset within the tree?

@maxrjones
Contributor Author

If you look at https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html#datatrees, I'm referring to what that section calls "groups". While those docs say "you can think of it (a datatree) as a recursive dict of Dataset objects", I think it's important for this issue to recognize that a node/group is not exactly a dataset. You can use DataTree[<node-name>].ds to get a view of the contained dataset, or call DataTree[<node-name>].to_dataset() to get a copy, but nodes implement different methods than datasets, so the two are not interchangeable.


2 participants