
Error while extracting SSP126 temperature & salinity #207

Closed
yadidya-b opened this issue Feb 1, 2022 · 6 comments
@yadidya-b

This is my code:

url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
query_tsz = dict(experiment_id=['historical'],
                 table_id='Omon', source_id=['CanESM5'],
                 variable_id=['thetao', 'so'],
                 member_id='r1i1p1f1')
col_subset = col.search(require_all_on=['experiment_id'], **query_tsz)
print(col_subset.df.groupby('experiment_id')[['member_id', 'variable_id', 'table_id']].nunique())
cat_tsz = col.search(**query_tsz)
cat_tsz.df['experiment_id'].unique()
z_kwargs = {'consolidated': True, 'decode_times': False}
# pass the preprocessing directly
with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    dset_dict_tsz = cat_tsz.to_dataset_dict(zarr_kwargs=z_kwargs,
                                            preprocess=wrapper)

Error:

ValueError: destination buffer too small; expected at least 27456, got 8256

The above exception was the direct cause of the following exception:

...

OSError: 
            Failed to open zarr store.

            *** Arguments passed to xarray.open_zarr() ***:

            - store: <fsspec.mapping.FSMap object at 0x7ff6187465b0>
            - kwargs: {'consolidated': True, 'decode_times': False}

            *** fsspec options used ***:

            - root: cmip6/CMIP6/ScenarioMIP/CCCma/CanESM5/ssp126/r1i1p1f1/Omon/so/gn/v20190429
            - protocol: ('gcs', 'gs')

            ********************************************

@yadidya-b
Author

The same code works for the historical and the other SSP scenarios.

@yadidya-b
Author

@jbusecke

@jbusecke
Owner

jbusecke commented Feb 4, 2022

Hi @Yadidya5, thanks for using cmip6_preprocessing.

I cannot reproduce your example above. It seems to me that the wrapper function is potentially the cause of the issue.

I tried two things:

  1. use combined_preprocessing instead of wrapper:
import dask
import intake
from cmip6_preprocessing.preprocessing import combined_preprocessing
wrapper = combined_preprocessing

url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
query_tsz = dict(experiment_id=['historical'],
                 table_id='Omon', source_id=['CanESM5'],
                 variable_id=['thetao', 'so'],
                 member_id='r1i1p1f1')
col_subset = col.search(require_all_on=['experiment_id'], **query_tsz)
print(col_subset.df.groupby('experiment_id')[['member_id', 'variable_id', 'table_id']].nunique())
cat_tsz = col.search(**query_tsz)
cat_tsz.df['experiment_id'].unique()
z_kwargs = {'consolidated': True, 'decode_times': False}
# pass the preprocessing directly
with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    dset_dict_tsz = cat_tsz.to_dataset_dict(zarr_kwargs=z_kwargs,
                                            preprocess=wrapper)
  2. Check that the zarr stores you are accessing are readable (without all the additional logic of cmip6_pp):
import xarray as xr
for store in cat_tsz.df['zstore'].tolist():
    ds_test = xr.open_zarr(store)

Both of these work for me. Could you provide some information on what wrapper contains? Also, are you executing this on the Pangeo deployment, or elsewhere?

@yadidya-b
Author

yadidya-b commented Feb 4, 2022

Thanks for the reply @jbusecke!

I defined the wrapper function as:

def wrapper(ds):
    ds = ds.copy()
    ds = rename_cmip6(ds)
    ds = promote_empty_dims(ds)
    ds = broadcast_lonlat(ds)
    ds = replace_x_y_nominal_lat_lon(ds)
    return ds

I'm running this code on my local machine.


Oops!

I just rechecked the error and apologies for my mistake in the code I first attached.

The code is running for historical, ssp245, and ssp585 but only failing while trying to extract ssp126.

import dask
import intake
from cmip6_preprocessing.preprocessing import combined_preprocessing
wrapper = combined_preprocessing

url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
query_tsz = dict(experiment_id=['ssp126'],
                 table_id='Omon', source_id=['CanESM5'],
                 variable_id=['thetao', 'so'],
                 member_id='r1i1p1f1')
col_subset = col.search(require_all_on=['experiment_id'], **query_tsz)
print(col_subset.df.groupby('experiment_id')[['member_id', 'variable_id', 'table_id']].nunique())
cat_tsz = col.search(**query_tsz)
cat_tsz.df['experiment_id'].unique()
z_kwargs = {'consolidated': True, 'decode_times': False}
# pass the preprocessing directly
with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    dset_dict_tsz = cat_tsz.to_dataset_dict(zarr_kwargs=z_kwargs,
                                            preprocess=wrapper)

After running

import xarray as xr
for store in cat_tsz.df['zstore'].tolist():
    ds_test = xr.open_zarr(store)

The error is:

ValueError: cannot reshape array of size 3432 into shape (1032,)
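Since a plain loop over xr.open_zarr stops at the first exception, a small variation can report every store that fails to open. The following is a sketch: find_bad_stores and fake_open are hypothetical names, and fake_open merely stands in for xr.open_zarr so the example is self-contained.

```python
# Sketch: probe every store and report each failure instead of stopping
# at the first exception. `open_store` stands in for xr.open_zarr here.
def find_bad_stores(stores, open_store):
    bad = []
    for store in stores:
        try:
            open_store(store)
        except Exception as e:
            print(f"FAILED: {store} ({type(e).__name__}: {e})")
            bad.append(store)
    return bad

# Hypothetical stand-in for xr.open_zarr: the 'so' store raises,
# mimicking the reshape error reported above.
def fake_open(store):
    if '/so/' in store:
        raise ValueError("cannot reshape array of size 3432 into shape (1032,)")

bad = find_bad_stores(['gs://cmip6/demo/thetao/store',
                       'gs://cmip6/demo/so/store'], fake_open)
print(bad)  # ['gs://cmip6/demo/so/store']
```

In the real session, passing xr.open_zarr as open_store and cat_tsz.df['zstore'].tolist() as stores would identify which of the two stores is broken.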

@jbusecke
Owner

Sorry for the delay. I have just confirmed this error. Since it is not related to cmip6_preprocessing (as we confirmed by opening the 'raw' store), it cannot be fixed here, so I just opened an issue about it. Please note that I don't think this problem will be fixed immediately; we are working on quite major changes over there. In the meantime, I would suggest you modify the resulting catalog and remove the stores that fail to open. Sorry for the inconvenience.
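The suggested workaround amounts to filtering the catalog dataframe before calling to_dataset_dict. A minimal sketch with plain pandas, where the column names mirror the intake-esm catalog and the store paths are hypothetical placeholders:

```python
import pandas as pd

# Hypothetical slice of a catalog dataframe: one readable store, one broken one.
df = pd.DataFrame({
    'variable_id': ['thetao', 'so'],
    'zstore': ['gs://cmip6/demo/thetao/store',
               'gs://cmip6/demo/so/store'],  # the store that fails to open
})

bad_stores = {'gs://cmip6/demo/so/store'}

# Keep only rows whose zarr store opened successfully, so that
# to_dataset_dict() never touches the broken store.
filtered = df[~df['zstore'].isin(bad_stores)].reset_index(drop=True)
print(filtered['variable_id'].tolist())  # ['thetao']
```

How the filtered dataframe is handed back to the intake-esm catalog object varies between intake-esm versions, so treat this as the filtering step only.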

@yadidya-b
Author

Thank you for the clarification @jbusecke! I understand that it may take a while for this issue to be resolved but hats off to all the amazing work that your team is up to. Cheers!
