Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqlAlchemy handling of firebird stream blobs (a.k.a. BlobReader objects) #58

Open
bryancole opened this issue Oct 10, 2023 · 6 comments

Comments

@bryancole
Copy link

I'm having a problem reading large blobs from a firebird database, using the new firebird-driver.

For large blobs, the firebird driver (both new and old) returns BlobReader objects, rather than the fully materialised python bytes-objects. These BlobReader objects are file-like and can be read to obtain the binary data. The thing that has changed in the new firebird driver is that the Cursor.close() method now closes all BlobReader objects associated with that cursor. Unfortunately, when sqlalchemy executes a statement returning data (i.e. calls fetchXXX() on the cursor), it always closes the cursor after iterating over it, but before accessing any of the data in the returned rows. Hence, later on when the data is passed to the Dialect TypeDecorator to handle type-conversions, the cursor has been closed so all the BlobReader objects are closed, so reading the data fails.

Although I can't see a way to fix this in the Dialect, I'm wondering if you have any ideas, before I look at modifying the sqlalchemy core to add some sort of hook to customise Cursor closing behaviour.
The author of the new firebird driver seems adamant that the BlobReaders ought to be closed when the Cursor is closed.

@pauldex
Copy link
Owner

pauldex commented Oct 12, 2023 via email

@bryancole
Copy link
Author

Thanks for your response. Yes, that's right ( fdb driver doesn't close BlobReaders on Cursor.close(), firebird-driver does).

Sadly, calling .all() or .first() doesn't help. These calls retrieve all data for the query by iterating over the cursor and collecting the data into a list. This happens in the CursorFetchStrategy class (see https://github.com/sqlalchemy/sqlalchemy/blob/a342b3d503f968bbf43f3b2de1f4f623b03a6310/lib/sqlalchemy/engine/cursor.py#L1129C1-L1139C59 ). The cursor is closed immediately after the DB-API2 returns the data as a list (which BlobReader objects in the list, say). Data processing happens later.

This is giving me an idea, that maybe I can subclass the CursorFetchStrategy object to avoid closing the cursor at the end of the fetchXXX() calls. However, I'm not sure how to close the cursor safely after processing is complete. I'll think about this some more.

I have worked around the problem in the short term by setting the stream_blob_threshold config for the driver to be some very large value, such that BlobReaders are never returned. This is OK for my application for the time being.

@fdcastel
Copy link
Contributor

@bryancole did you make any progress about this issue?

I cannot work on this now, but I think is appropriate to cite this comment from SQLAchemy maintainer.

@fdcastel
Copy link
Contributor

fdcastel commented Dec 24, 2023

Related discussion from firebird-driver maintainer.

@bryancole
Copy link
Author

bryancole commented Dec 29, 2023 via email

@pauldex
Copy link
Owner

pauldex commented Jan 6, 2024

@bryancole, should this issue remain open?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants