Add support for SQLRecordManager #8
Comments
Hi Christian, this is interesting. The "Requirements" section of the corresponding documentation says:
... and lists a few compatible vector stores. Did you already verify that it does not work well with CrateDB, and if so, why? With kind regards,
I did try to use the record manager. It works when I use a SQLite table for the record manager to store its metadata, but not when I want to use CrateDB itself for that purpose. The reason is that the table uses some SQL features that the CrateDB dialect does not support.
I think we need to distinguish two datastores: one in which the embeddings are stored, and an additional SQL database in which the metadata is stored.
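To make the split between the two datastores concrete, here is a minimal sketch using only the Python standard library. It is not LangChain's record manager: the `seen` table, `fake_embed`, `index_document`, and the in-memory `vector_store` dict (standing in for CrateDB) are all hypothetical names chosen for illustration.

```python
import hashlib
import sqlite3

# Hypothetical stand-in for the vector store; a real setup would write to CrateDB.
vector_store = {}

# Separate SQL database that holds only the bookkeeping metadata.
records = sqlite3.connect(":memory:")
records.execute(
    "CREATE TABLE seen (doc_key TEXT PRIMARY KEY, content_hash TEXT NOT NULL)"
)


def fake_embed(text):
    """Placeholder embedding: not a real model, just deterministic numbers."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]


def index_document(doc_key, text):
    """Write to the vector store only if the content is new or has changed."""
    content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    row = records.execute(
        "SELECT content_hash FROM seen WHERE doc_key = ?", (doc_key,)
    ).fetchone()
    if row is not None and row[0] == content_hash:
        return "skipped"  # unchanged content, nothing to re-embed
    vector_store[doc_key] = fake_embed(text)
    records.execute(
        "INSERT OR REPLACE INTO seen (doc_key, content_hash) VALUES (?, ?)",
        (doc_key, content_hash),
    )
    records.commit()
    return "added" if row is None else "updated"
```

Indexing the same key with the same text a second time returns `"skipped"`, so the embedding store is only touched when the metadata store reports new or changed content.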
Hi. Thank you for bringing this subsystem of LangChain to our attention; we apparently missed adding support in the first iteration. Given that the corresponding documentation lists the With kind regards,
Hi again. crate/crate-python#18 explores the situation, but unfortunately, it can't be made to work, as there is indeed a blocker:
So, we will probably close this as "wontfix". It doesn't mean it is impossible, but currently, it would stretch our capacity too much. Let us know if you consider this to be an important improvement with a high priority. Otherwise, shall we close the issue?
It would be more of a nice-to-have. I can't close it for some reason.
Hi Alex, thanks for clarifying. We can also keep the issue open to track this topic into the future. Once corresponding support is added to CrateDB, we can easily add support here as well. However, that is unlikely, because enforcing uniqueness constraints on larger-than-memory data would be a significant performance hog. With kind regards,
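For readers unfamiliar with the kind of uniqueness constraint discussed here, the following sketch shows one in SQLite, where it is supported. The `record` table, its columns, and the `upsert` helper are hypothetical and simplified; this is not claimed to be LangChain's actual schema or the exact statement that fails on CrateDB.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified bookkeeping table with a composite uniqueness constraint.
conn.execute(
    """
    CREATE TABLE record (
        namespace TEXT NOT NULL,
        record_key TEXT NOT NULL,
        updated_at REAL NOT NULL,
        UNIQUE (namespace, record_key)
    )
    """
)


def upsert(namespace, record_key, now=None):
    """Insert a row, or refresh its timestamp if the key already exists."""
    now = time.time() if now is None else now
    conn.execute(
        """
        INSERT INTO record (namespace, record_key, updated_at)
        VALUES (?, ?, ?)
        ON CONFLICT (namespace, record_key)
        DO UPDATE SET updated_at = excluded.updated_at
        """,
        (namespace, record_key, now),
    )
    conn.commit()


def count_rows():
    """Return the number of bookkeeping rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM record").fetchone()[0]
```

The database has to check every insert against all existing `(namespace, record_key)` pairs, which is the cost the comment above refers to for larger-than-memory data. The `ON CONFLICT` clause assumes SQLite 3.24 or later.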
The Record Manager capabilities in LangChain help to deduplicate content, clean up deleted or mutated source content, etc.: https://python.langchain.com/docs/modules/data_connection/indexing
Adding such capabilities would make it easier to manage embeddings in CrateDB via LangChain.