Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

autotest: Add a VSIFile wrapper class #8222

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

dbaston
Copy link
Member

@dbaston dbaston commented Aug 14, 2023

What does this PR do?

This PR adds a gdal.vsi_open() function that returns a file-like object, allowing virtual files to be used as "regular" Python files in some cases. For example, a zipped CSV can be read using the Python csv module:

with gdal.vsi_open(f'/vsizip/prime_meridian.csv') as f:
    records = [x for x in csv.DictReader(f)]

assert records[2]['INFORMATION_SOURCE'] == "Institut Geographique National (IGN), Paris"

Still need to correctly handle different linebreak/OS combinations, ensure errors are raised when accessing a closed file, handle append mode (or raise error), and probably other things.

What are related issues/pull requests?

Tasklist

  • Add test case(s)
  • Add documentation
  • Updated Python API documentation (swig/include/python/docs/)
  • Review
  • Adjust for comments
  • All CI builds and checks have passed

@coveralls
Copy link
Collaborator

Coverage Status

coverage: 67.069% (-0.002%) from 67.071% when pulling d05da7b on dbaston:python-vsifile-class into 949ac93 on OSGeo:master.

self._binary = "b" in mode
self._encoding = encoding

self._fp = VSIFOpenL(self._path, self._mode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably throw an exception is None is returned

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, also switched to VSIFOpenExL to capture the error message

return self

def __next__(self):
chunk_size = 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably that CPLReadLineL() could be used to avoid this reimplementatoin

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@rcoup
Copy link
Member

rcoup commented Aug 16, 2023

If it's helpful, here's an implementation we have been using at Koordinates for a while, with a set of tests: feel free to crib anything that's useful. I've licensed it as MIT. Any questions most welcome.

https://gist.github.com/rcoup/5f4b49149ca26084eed4dccfd1b12895

  • was written quite a while ago — I would guess it could be simpler today incorporating some stdlib updates in recent Python versions and/or other bugfixes. (eg: not having open() essentially C&P)
  • Inheriting from io.RawIOBase means it can fit under/into the standard python buffered & text IO wrapper classes
  • I've manually removed/replaced/inlined use of a handful of internal library functions & test setup. If there's anything you get stuck on please ask.

@dbaston
Copy link
Member Author

dbaston commented Aug 16, 2023

Fantastic, thank you!

@dbaston
Copy link
Member Author

dbaston commented Apr 17, 2024

I have not forgotten about this. I'm leaning towards moving it to gdaltest to focus on internal use for now.

Copy link

The GDAL project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 28 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular link
    to any issues which this pull request fixes

  • that you have written unit tests where possible
    In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.
    If there is no further activity on this pull request, it will be closed in 2 weeks.

@github-actions github-actions bot added the stale label May 16, 2024
@dbaston dbaston force-pushed the python-vsifile-class branch 2 times, most recently from 8532886 to 6a5158c Compare May 21, 2024 13:26
@dbaston dbaston changed the title WIP: Python bindings: Add a VSIFile wrapper class autotest: Add a VSIFile wrapper class May 21, 2024
@github-actions github-actions bot removed the stale label May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants