Reconsider FieldSet.from_xarray_dataset() #1940

Open
Tracked by #1844
VeckoTheGecko opened this issue Mar 18, 2025 · 3 comments

Comments

@VeckoTheGecko
Contributor

VeckoTheGecko commented Mar 18, 2025

FieldSet.from_xarray_dataset() assumes that a single dataset contains all field information. That assumption may not hold, since field information can be scattered across multiple files that have different dimensions.

I'm not sure how useful this method is as an abstraction given that limitation. It may be worth removing it outright, or at least clearly documenting that it's limited in scope.

Removal can be done at a later stage (removing this method now would just interfere with the test cases that currently use it).

@VeckoTheGecko VeckoTheGecko changed the title FieldSet.from_xarray_dataset() Reconsider FieldSet.from_xarray_dataset() Mar 18, 2025
@fluidnumerics-joe

With us moving towards xarray/uxarray adoption, this probably won't be needed.

That said, the assumption that an xarray.Dataset contains all field information may not be such a bad one. As I understand it, an xarray.Dataset is composed of one or more xarray.DataArray objects, each representing a field with its own coordinates and dimensions. open_mfdataset can load data arrays from multiple files and combine them into a single dataset.
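To illustrate the point about a Dataset holding several DataArrays with their own dimensions, here is a minimal sketch using small in-memory arrays. The variable names (`U`, `bathymetry`), the coordinate values, and the array shapes are made up for illustration and are not taken from Parcels or from any real data file.

```python
import numpy as np
import xarray as xr

# Hypothetical coordinates, chosen only for illustration.
time = np.arange(3)
lat = np.array([0.0, 1.0])
lon = np.array([10.0, 11.0, 12.0])

# A time-varying field on (time, lat, lon).
u = xr.DataArray(
    np.zeros((3, 2, 3)),
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": lat, "lon": lon},
)

# A static field on (lat, lon) only: fewer dimensions than U.
bathymetry = xr.DataArray(
    np.ones((2, 3)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# One Dataset can hold both, each variable keeping its own dimensions.
ds = xr.Dataset({"U": u, "bathymetry": bathymetry})

print(ds["U"].dims)
print(ds["bathymetry"].dims)
```

Variables in the same Dataset share coordinates by name, so this only works cleanly when fields that use the same dimension name really are on the same grid along that dimension.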

@VeckoTheGecko
Contributor Author

I see. I hadn't used open_mfdataset for combining along non-time dimensions, so I'm not entirely sure how it would work when U and V are stored in separate files. I think you're right; we can investigate what this would look like in v4.

@fluidnumerics-joe

The documentation suggests that open_mfdataset will, by default, combine the datasets from all the files into a single dataset using combine_by_coords. I think this bit from the combine_by_coords docs sums up the behavior nicely:

Attempt to auto-magically combine the given datasets (or data arrays) into one by using dimension coordinates.

This function attempts to combine a group of datasets along any number of dimensions into a single entity by inspecting coords and metadata and using a combination of concat and merge.

Will attempt to order the datasets such that the values in their dimension coordinates are monotonic along all dimensions.
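As a rough sketch of the behavior described in that docstring, the snippet below calls `xr.combine_by_coords` directly on in-memory datasets that stand in for separate files. The helper `make_piece`, the variable names `U` and `V`, and the coordinate values are hypothetical; real files would be opened with `open_mfdataset` instead.

```python
import numpy as np
import xarray as xr


def make_piece(name, times):
    """Build a small dataset with one variable, standing in for one file."""
    lat = [0.0, 1.0]
    lon = [10.0, 11.0, 12.0]
    data = np.zeros((len(times), len(lat), len(lon)))
    return xr.Dataset(
        {name: (("time", "lat", "lon"), data)},
        coords={"time": times, "lat": lat, "lon": lon},
    )


# U and V live in separate "files", each split in time and
# deliberately listed out of order.
pieces = [
    make_piece("V", [2, 3]),
    make_piece("U", [0, 1]),
    make_piece("U", [2, 3]),
    make_piece("V", [0, 1]),
]

# Pieces with the same variable are concatenated along time, ordered by
# their coordinate values; the U and V results are then merged.
combined = xr.combine_by_coords(pieces)

print(sorted(combined.data_vars))
print(combined["time"].values)
```

This relies on every piece carrying dimension coordinates; without them, combine_by_coords has nothing to order the pieces by.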

In the example I wrote (under the "Re-working our example" header), the files that @danliba provided have u, v, and w stored in separate files, and this works just fine:

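import uxarray as ux

# Grid (mesh) file.
grid_path = "./data/channel_lizarbe/fesom.mesh.diag.nc"
# u, v, and w are stored in separate data files.
data_path = [
    "./data/channel_lizarbe/u.fesom.2005_cut.nc",
    "./data/channel_lizarbe/v.fesom.2005_cut.nc",
    "./data/channel_lizarbe/w.fesom.2005_cut.nc",
]

# Open the three data files together with the grid as one dataset.
uxds = ux.open_mfdataset(grid_path, data_path)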

Projects
Status: Backlog