GDAL_DISABLE_READDIR_ON_OPEN default configuration is confusing for remote resources #9443

Open
hobu opened this issue Mar 11, 2024 · 5 comments

Comments

@hobu
Contributor

hobu commented Mar 11, 2024

Feature description

What is the case for defaulting to GDAL_DISABLE_READDIR_ON_OPEN=FALSE for remote VSI sources? Users frequently trip over this feature, its double-negative nature makes it very difficult to read, and you almost never want it when using drivers for single-file formats.

I wonder if the option should be renamed to GDAL_ENABLE_READDIR_SCAN_ON_OPEN=TRUE. Do other users have trouble with this option?

Additional context

No response

@rouault
Member

rouault commented Mar 11, 2024

> What is the case for defaulting to GDAL_DISABLE_READDIR_ON_OPEN=FALSE for remote VSI sources?

This setting is fundamentally a GDAL core mechanism, implemented in the GDALOpenInfo class. Its general default value is NO.
For remote sources, there might be situations where it is beneficial to establish the directory listing to avoid useless file probing, but only if there are not too many files in the directory, which we cannot anticipate.

I agree the double-negative naming is a pain. I've never gotten used to it myself.

That said, the setting is actually three-valued, not binary:

  • GDAL_DISABLE_READDIR_ON_OPEN=FALSE: issue a ReadDir() at GDALOpenInfo construction
  • GDAL_DISABLE_READDIR_ON_OPEN=TRUE: do not issue a ReadDir() at GDALOpenInfo construction, but each driver will probe individual (generally sidecar) files using Open() or Stat()
  • GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR: do not issue a ReadDir() at GDALOpenInfo construction, and simulate an empty list of sibling files, so no probing through Open() or Stat() is attempted at all. This is the one you want to use when you know you'll never need a sidecar file.

I would say the 3 use cases are valid.
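
For readers who find the double negative hard to parse, the three values above can be restated as a small Python model. This is only an illustrative sketch, not GDAL code: `SiblingStrategy` and `sibling_strategy` are hypothetical names, and the set of strings treated as false is an assumption about how GDAL parses boolean options.

```python
from enum import Enum


class SiblingStrategy(Enum):
    """Hypothetical labels for the three behaviours listed above."""
    READDIR = "list the directory once when the dataset is opened"
    PROBE = "skip the listing; drivers probe sidecar files one by one"
    EMPTY_DIR = "skip the listing and pretend there are no sibling files"


# Assumption: these strings are read as "false", as in GDAL's boolean parsing.
_FALSE_STRINGS = {"NO", "FALSE", "OFF", "0"}


def sibling_strategy(value="NO"):
    """Map a GDAL_DISABLE_READDIR_ON_OPEN value to the behaviour it selects."""
    normalized = value.strip().upper()
    if normalized == "EMPTY_DIR":
        return SiblingStrategy.EMPTY_DIR
    if normalized in _FALSE_STRINGS:
        # "Disable = false" means the directory listing stays enabled.
        return SiblingStrategy.READDIR
    return SiblingStrategy.PROBE


for setting in ("FALSE", "TRUE", "EMPTY_DIR"):
    print(f"{setting:>9}: {sibling_strategy(setting).value}")
```

The default argument mirrors the general default of NO mentioned earlier in this comment, which is why an unset option ends up listing the directory.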

@hobu
Contributor Author

hobu commented Mar 11, 2024

> This setting is fundamentally a GDAL core mechanism, implemented in the GDALOpenInfo class. Its general default value is NO.

This core GDAL mechanism was first written when people were assuming they were reading data from spinning disks. We almost never do that nowadays.

I think the default should be GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR at the very least. I opened the ticket to discover if there are others with similar feelings. I do know that any cloud-oriented GDAL processing scripts are almost always littered with configuration management stuff to make sure to turn this behavior off, often after being burned by it.
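
As an illustration of the kind of configuration boilerplate described above, here is a minimal sketch that scopes the option to one block of code through the process environment. It assumes GDAL falls back to environment variables when a config option has not been set explicitly through its own API; `readdir_on_open` is a hypothetical helper name, not something GDAL provides.

```python
import os
from contextlib import contextmanager

OPTION = "GDAL_DISABLE_READDIR_ON_OPEN"


@contextmanager
def readdir_on_open(value):
    """Temporarily set GDAL_DISABLE_READDIR_ON_OPEN in the process environment."""
    previous = os.environ.get(OPTION)
    os.environ[OPTION] = value
    try:
        yield
    finally:
        # Restore whatever was there before, including "not set at all".
        if previous is None:
            os.environ.pop(OPTION, None)
        else:
            os.environ[OPTION] = previous


with readdir_on_open("EMPTY_DIR"):
    # Open single-file remote datasets here; no sibling listing or probing is wanted.
    print(os.environ[OPTION])
```

The environment is process-wide, so this sketch is not safe when several threads need different values. Scripts that already import the GDAL bindings may prefer `gdal.SetConfigOption`, which takes precedence over the environment.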

@rouault
Member

rouault commented Mar 11, 2024

I agree the computing environment has shifted a bit in the last 20 years :-) and the current situation is not ideal.

> I think the default should be GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR at the very least.

I don't think we can do that without a major break in backwards compatibility. That would break a lot of drivers that require or can make use of sidecar files, for example when users rely on external .ovr overviews.

@mdsumner
Contributor

Does this affect scanning a directory within an archive (vsizip, vsitar, etc.)? I've assumed it does, which is why I won't set this value as a default even though I often think about doing that ...

@rouault
Member

rouault commented May 10, 2024

> Does this affect scanning a directory within an archive? (vsizip, vsitar, etc.)

Yes, it does. It controls whether directory listing is done. For /vsizip/, directory listing should be relatively fast (as the list of files is at the end of the ZIP). For /vsitar/, it will need to seek from file to file, as there is no file index.
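
The difference between the two archive formats can be seen with Python's standard library alone. The sketch below does not use GDAL or its /vsizip/ and /vsitar/ handlers; it only illustrates the general point that a ZIP listing comes from an index at the end of the file, while a TAR listing means walking the member headers one after another. The member names are made up for the example.

```python
import io
import tarfile
import zipfile

# Hypothetical archive content: a raster and its sidecar overview file.
members = {"a.tif": b"pixels", "a.tif.ovr": b"overviews"}

# Build a small ZIP archive in memory.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    for name, payload in members.items():
        archive.writestr(name, payload)

# Build an equivalent TAR archive in memory.
tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as archive:
    for name, payload in members.items():
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

# ZIP: the names are read from the central directory stored at the end.
zip_buffer.seek(0)
with zipfile.ZipFile(zip_buffer) as archive:
    zip_names = archive.namelist()

# TAR: there is no index, so each member header is visited in turn.
tar_buffer.seek(0)
with tarfile.open(fileobj=tar_buffer, mode="r") as archive:
    tar_names = [member.name for member in archive]

print(zip_names)
print(tar_names)
```

Both listings return the same names; what differs is how much of the archive has to be read to produce them, which matters most when the archive is remote.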
