Add remote-read support to read_nwb#2190
Open
h-mayorquin wants to merge 2 commits into
Open
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #2190 +/- ##
==========================================
+ Coverage 95.18% 95.20% +0.01%
==========================================
Files 30 30
Lines 2991 3003 +12
Branches 444 446 +2
==========================================
+ Hits 2847 2859 +12
Misses 86 86
Partials 58 58
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
pynwb.read_nwbread_nwb
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2150.
pynwb.read_nwb(path)now accepts remote URLs (s3://,gs://,abfs://,https://, etc.) in addition to local paths. Anonymous public remote files just work; for credentialed remote access, configure credentials in the environment (AWS profile,GOOGLE_APPLICATION_CREDENTIALS, Azure managed identity, etc.) and fsspec picks them up automatically. The signature stays at one parameter to honor the originalread_nwbdesign from #1974. Power-user scenarios (e.g. forced h5py ROS3 driver or custom S3-compatible endpoints) continue to require dropping toNWBHDF5IOorNWBZarrIOdirectly.To move this forward I have decided not to use the
can_readmethods for dispatch, like I did in #1994 for local file support. What we have here is unverified URL pattern matching: routing based on conventions like the.zarrsuffix and DANDI's/zarr/<uuid>/URL layout, without opening the file. It seems to me thatcan_read's contract is stronger and the matching we have here would not fit well there. Especially the convention regarding DANDI Zarr files, which is very NWB-specific and in my view lives naturally here. I think we can move it there later if we want. There is also a secondary performance cost:can_readworks by opening the file, so using it for dispatch would double the network requests on every remote read, which we are avoiding at the moment.The Zarr test depends on hdmf-zarr PR #348 (a one-line
resolve_reffix for fsspec self-references that I opened upstream), sorequirements-opt.txttemporarily pinshdmf-zarrto that PR's branch via PEP 508 git notation. Once #348 lands in an hdmf-zarr release, revert that line back to plainhdmf-zarr; the in-file comment documents the cleanup step.I am also fixing a pre-existing scheme bug in
NWBHDF5IO.read_nwb's streaming branch: the fsspec filesystem was hard-coded to"http"regardless of URL scheme, sos3://,gs://, andabfs://paths silently failed for non-HTTP backends.Checklist
ruff check . && codespellfrom the source directory.