Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: rmv s3fs dependency #548

Closed
2 tasks done
bendichter opened this issue Jan 8, 2025 · 3 comments · Fixed by #549
Closed
2 tasks done

[Feature]: rmv s3fs dependency #548

bendichter opened this issue Jan 8, 2025 · 3 comments · Fixed by #549
Assignees
Labels
category: enhancement improvements of code or code behavior

Comments

@bendichter
Copy link
Contributor

What would you like to see added to the NWBInspector?

Currently, there is an s3fs dependency. While not directly imported, it serves as the backend for fsspec when handling S3 URLs. Specifically, in src/nwbinspector/tools/_read_nwbfile.py, when reading files with "s3://" paths, the code uses fsspec.filesystem("s3", anon=True) which internally uses s3fs to handle the S3 file access.

The problem is that this causes a dependency chain that makes it difficult to use the latest packages. I think with the recent shift to using remfile, we should not need to maintain this apsect of the codebase. Do we want to be using fsspec at all? Are we even using the functions defined in _read_nwbfile.py at all? If not, let's refactor and simplify our dependencies for better environment management.

Do you have any interest in helping implement the feature?

No.

Code of Conduct

@bendichter bendichter added the category: enhancement improvements of code or code behavior label Jan 8, 2025
@stephprince
Copy link
Contributor

I think with the recent shift to using remfile, we should not need to maintain this apsect of the codebase. Do we want to be using fsspec at all? Are we even using the functions defined in _read_nwbfile.py at all?

I saw the linked PR, but just wanted to discuss here more generally how we want to be using fsspec / remfile and the functions defined in _read_nwbfile.py.

Currently, users wanting to stream data can use inspect_dandiset, inspect_dandi_file_path, or inspect_url, which all use remfile. Otherwise, users are directed to use inspect_all and inspect_nwbfile functions. These functions use read_nwbfile from _read_nwbfile.py to access the nwbfile. As you mentioned above, read_nwbfile has the option to use fsspec if an http:// or s3:// path are provided. However, I don't think read_nwbfile's fsspec option is really currently accessible when being called from inspect_all and inspect_nwbfile, and I think users should use inspect_url if they want the streaming feature anyway.

From what I understood from other discussions, read_nwbfile was meant to be eventually be transferred to pynwb. The pynwb read_nwb implementation is in progress, but not finished yet. I think once we complete the pynwb implementation (including support for streaming), we could then update nwbinspector to use the pynwb implementation of read_nwb and remove _read_nwbfile.py entirely. Thoughts?

Regarding s3fs, another option could be to make it an optional dependency until we migrate to the pynwb implementation of read_nwb.

@bendichter
Copy link
Contributor Author

Yes, that all makes sense to me. I wanted to make sure I didn't delete any functionality that is really needed, but it seems like this mode is not really used anymore. We could make it an optional dependency, but since it would be the only dependency in that mode, I think it would be just as easy for users to install it when they need it. As it stands now as a primary dependency it is causing installation issues that are really unnecessary.

@stephprince
Copy link
Contributor

We could make it an optional dependency, but since it would be the only dependency in that mode, I think it would be just as easy for users to install it when they need it.

That makes sense, the error message is pretty clear anyway if someone was using that mode. Sounds good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: enhancement improvements of code or code behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants