TypeError: Filesystem needs to support async operations #2554

Open
norlandrhagen opened this issue Dec 12, 2024 · 6 comments
Labels
bug Potential issues with the zarr-python library

Comments

@norlandrhagen

Zarr version

3.0.0b3

Numcodecs version

0.14.1

Python Version

3.12

Operating System

Mac

Installation

pip

Description

I'm having issues with using zarr.open_group for a local Zarr store.

Calling zarr.open_group with a full URI:

zg = zarr.open_group('file:///air.zarr')
raises: TypeError: Filesystem needs to support async operations.

Calling open_group on a relative local path or an S3 path works fine:
zg = zarr.open_group('air.zarr')
zg = zarr.open_group('s3://carbonplan-share/air_temp.zarr')
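Since plain local paths work, one possible workaround until this is fixed is to turn the file:// URI back into a path before calling open_group. The following is a minimal sketch using only the standard library; `file_uri_to_path` is a hypothetical helper name, not part of zarr-python.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_uri_to_path(uri: str) -> Path:
    """Convert a file:// URI into a local filesystem path (hypothetical helper)."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got {uri!r}")
    # url2pathname undoes percent-encoding and handles Windows drive letters.
    return Path(url2pathname(parsed.path))


uri = (Path.cwd() / "air.zarr").as_uri()
local_path = file_uri_to_path(uri)
# Passing the plain path string should take the same route as 'air.zarr' above:
# zg = zarr.open_group(str(local_path))
```

This only sidesteps the error on the caller's side; it does not change how zarr treats file:// URIs.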

Comments from @d-v-b in the Zarr Zulip chat:

I'm guessing this is a bug -- RemoteStore is not accurately named, it should be called FSSpecStore, because it basically just wraps fsspec. We require that the fsspec file system wrapped by RemoteStore support async operations, but it seems like the local file-flavored file system does not. Could you open an issue with this reproducer in it?

There are two possible solutions to this, and we should do both:

1. Associate the file:// protocol with LocalStore instead of RemoteStore.
2. Ensure that invoking RemoteStore('file:///path') works properly.
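The first option could look roughly like the sketch below: classify the input by its protocol and send file:// to the local store. This is an illustration only; `choose_store_kind` and the `LOCAL_PROTOCOLS` set are hypothetical names, and the actual dispatch lives in zarr's make_store_path (visible in the traceback below).

```python
# Protocols that refer to the local filesystem (assumed set, for illustration).
LOCAL_PROTOCOLS = {"file", "local"}


def choose_store_kind(store_like: str) -> str:
    """Return "local" or "fsspec" for a path or URL string (hypothetical helper)."""
    protocol, sep, _rest = store_like.partition("://")
    if not sep:
        # No "://" at all: treat it as an ordinary local path.
        return "local"
    if protocol in LOCAL_PROTOCOLS:
        # file:// URIs point at local files, so no fsspec wrapper is needed.
        return "local"
    return "fsspec"


for candidate in (
    "air.zarr",
    "file:///air.zarr",
    "s3://carbonplan-share/air_temp.zarr",
):
    print(candidate, "->", choose_store_kind(candidate))
```

Checking for "://" rather than parsing a scheme keeps Windows paths such as C:\data\air.zarr from being mistaken for URLs.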

Steps to reproduce

# requires: zarr, pooch, xarray

from pathlib import Path

import xarray as xr
import zarr

ds = xr.tutorial.open_dataset('air_temperature')
ds.to_zarr('air.zarr', mode='w')

# ex: file:///Users/../../../air.zarr
filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
zg = zarr.open_group(filepath_uri)  # raises TypeError

Error: TypeError: Filesystem needs to support async operations.

Traceback:

{
	"name": "TypeError",
	"message": "Filesystem needs to support async operations.",
	"stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[95], line 12
      9 # ex: file:///Users/../../../air.zarr
     11 filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
---> 12 zg = zarr.open_group(filepath_uri)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f\"{name}={arg}\"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/api/synchronous.py:216, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    199 @_deprecate_positional_args
    200 def open_group(
    201     store: StoreLike | None = None,
   (...)
    213     use_consolidated: bool | str | None = None,
    214 ) -> Group:
    215     return Group(
--> 216         sync(
    217             async_api.open_group(
    218                 store=store,
    219                 mode=mode,
    220                 cache_attrs=cache_attrs,
    221                 synchronizer=synchronizer,
    222                 path=path,
    223                 chunk_store=chunk_store,
    224                 storage_options=storage_options,
    225                 zarr_version=zarr_version,
    226                 zarr_format=zarr_format,
    227                 meta_array=meta_array,
    228                 attributes=attributes,
    229                 use_consolidated=use_consolidated,
    230             )
    231         )
    232     )

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/core/sync.py:100, in _runner(coro)
     95 \"\"\"
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 \"\"\"
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/api/asynchronous.py:721, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    718 if chunk_store is not None:
    719     warnings.warn(\"chunk_store is not yet implemented\", RuntimeWarning, stacklevel=2)
--> 721 store_path = await make_store_path(store, mode=mode, storage_options=storage_options, path=path)
    723 if attributes is None:
    724     attributes = {}

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/common.py:305, in make_store_path(store_like, path, mode, storage_options)
    303 if _is_fsspec_uri(store_like):
    304     used_storage_options = True
--> 305     store = RemoteStore.from_url(
    306         store_like, storage_options=storage_options, read_only=_read_only
    307     )
    308 else:
    309     store = await LocalStore.open(root=Path(store_like), read_only=_read_only)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/remote.py:176, in RemoteStore.from_url(cls, url, storage_options, read_only, allowed_exceptions)
    172 if \"://\" in path and not path.startswith(\"http\"):
    173     # `not path.startswith(\"http\")` is a special case for the http filesystem (¯\\_(ツ)_/¯)
    174     path = fs._strip_protocol(path)
--> 176 return cls(fs=fs, path=path, read_only=read_only, allowed_exceptions=allowed_exceptions)

File ~/miniforge3/envs/virtualizarr-min-deps/lib/python3.13/site-packages/zarr/storage/remote.py:90, in RemoteStore.__init__(self, fs, read_only, path, allowed_exceptions)
     87 self.allowed_exceptions = allowed_exceptions
     89 if not self.fs.async_impl:
---> 90     raise TypeError(\"Filesystem needs to support async operations.\")
     91 if not self.fs.asynchronous:
     92     warnings.warn(
     93         f\"fs ({fs}) was not created with `asynchronous=True`, this may lead to surprising behavior\",
     94         stacklevel=2,
     95     )

TypeError: Filesystem needs to support async operations."
}

Additional output

No response

@norlandrhagen norlandrhagen added the bug Potential issues with the zarr-python library label Dec 12, 2024
@jhamman
Member

jhamman commented Dec 13, 2024

I think this has been fixed upstream -- fsspec/filesystem_spec#1755

@norlandrhagen - would you mind trying with fsspec@main and reporting back?

@norlandrhagen
Author

Shall do! I'll give it a spin.

@norlandrhagen
Author

I installed the latest from fsspec:
pip install git+https://github.com/fsspec/filesystem_spec

xr.__version__
'2024.11.0'

zarr.__version__
'3.0.0b3'


fsspec.__version__
'2024.10.0.post24+gc36066c'

and I'm still getting the same error: TypeError: Filesystem needs to support async operations. 😕

TypeError                                 Traceback (most recent call last)
Cell In[8], line 3
      1 # ex: file:///Users/../../../air.zarr
      2 filepath_uri = (Path.cwd() / 'air.zarr').as_uri()
----> 3 zg = zarr.open_group(filepath_uri)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/api/synchronous.py:216, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    199 @_deprecate_positional_args
    200 def open_group(
    201     store: StoreLike | None = None,
   (...)
    213     use_consolidated: bool | str | None = None,
    214 ) -> Group:
    215     return Group(
--> 216         sync(
    217             async_api.open_group(
    218                 store=store,
    219                 mode=mode,
    220                 cache_attrs=cache_attrs,
    221                 synchronizer=synchronizer,
    222                 path=path,
    223                 chunk_store=chunk_store,
    224                 storage_options=storage_options,
    225                 zarr_version=zarr_version,
    226                 zarr_format=zarr_format,
    227                 meta_array=meta_array,
    228                 attributes=attributes,
    229                 use_consolidated=use_consolidated,
    230             )
    231         )
    232     )

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/core/sync.py:141, in sync(coro, loop, timeout)
    138 return_result = next(iter(finished)).result()
    140 if isinstance(return_result, BaseException):
--> 141     raise return_result
    142 else:
    143     return return_result

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/core/sync.py:100, in _runner(coro)
     95 """
     96 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     97 exception, the exception will be returned.
     98 """
     99 try:
--> 100     return await coro
    101 except Exception as ex:
    102     return ex

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/api/asynchronous.py:721, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, zarr_format, meta_array, attributes, use_consolidated)
    718 if chunk_store is not None:
    719     warnings.warn("chunk_store is not yet implemented", RuntimeWarning, stacklevel=2)
--> 721 store_path = await make_store_path(store, mode=mode, storage_options=storage_options, path=path)
    723 if attributes is None:
    724     attributes = {}

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/common.py:305, in make_store_path(store_like, path, mode, storage_options)
    303 if _is_fsspec_uri(store_like):
    304     used_storage_options = True
--> 305     store = RemoteStore.from_url(
    306         store_like, storage_options=storage_options, read_only=_read_only
    307     )
    308 else:
    309     store = await LocalStore.open(root=Path(store_like), read_only=_read_only)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/remote.py:176, in RemoteStore.from_url(cls, url, storage_options, read_only, allowed_exceptions)
    172 if "://" in path and not path.startswith("http"):
    173     # `not path.startswith("http")` is a special case for the http filesystem (¯\_(ツ)_/¯)
    174     path = fs._strip_protocol(path)
--> 176 return cls(fs=fs, path=path, read_only=read_only, allowed_exceptions=allowed_exceptions)

File ~/miniforge3/envs/zarr_v3_debug/lib/python3.12/site-packages/zarr/storage/remote.py:90, in RemoteStore.__init__(self, fs, read_only, path, allowed_exceptions)
     87 self.allowed_exceptions = allowed_exceptions
     89 if not self.fs.async_impl:
---> 90     raise TypeError("Filesystem needs to support async operations.")
     91 if not self.fs.asynchronous:
     92     warnings.warn(
     93         f"fs ({fs}) was not created with `asynchronous=True`, this may lead to surprising behavior",
     94         stacklevel=2,
     95     )

TypeError: Filesystem needs to support async operations.

norlandrhagen added a commit to zarr-developers/VirtualiZarr that referenced this issue Dec 16, 2024
@jhamman
Member

jhamman commented Dec 18, 2024

@martindurant / @moradology - any idea why we're not getting the async wrapper here?

@moradology

Seems familiar... I'm not 100% certain, but it looks to me like the issue for which this draft PR was cut: #2533

@martindurant
Member

Funny that this "needs to be async" happens during a sync() call :)

I think checking async_impl and auto-wrapping is the right thing to do here. Re-interpreting the URL isn't ideal unless we can guarantee that fsspec and objstore configs are identical, which for local might be OK but in general is not.
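A rough sketch of the "check async_impl and auto-wrap" idea, to make the suggestion concrete. `ensure_async_fs` is a hypothetical helper, not zarr's API. The default wrapper is assumed to be fsspec's AsyncFileSystemWrapper at the import path shown, which is only available in fsspec releases that include the upstream change mentioned earlier; the `wrap` parameter is there so the logic can be exercised without it.

```python
from typing import Any, Callable, Optional


def ensure_async_fs(fs: Any, wrap: Optional[Callable[[Any], Any]] = None) -> Any:
    """Return fs unchanged if it is async-capable, else a wrapped version.

    Hypothetical helper: `wrap` is any callable that turns a synchronous
    filesystem into an async-capable one.
    """
    if getattr(fs, "async_impl", False):
        # Already async (e.g. s3fs): nothing to do.
        return fs
    if wrap is None:
        # Assumption: recent fsspec exposes the wrapper at this import path.
        from fsspec.implementations.asyn_wrapper import AsyncFileSystemWrapper

        wrap = AsyncFileSystemWrapper
    # Sync-only filesystem (e.g. the local one): wrap instead of raising TypeError.
    return wrap(fs)
```

With something like this in place of the `raise TypeError(...)` in RemoteStore.__init__, a file:// filesystem would be wrapped rather than rejected, without re-interpreting the URL.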
