Experimental: RAG: SQL database content retriever #1056

Merged on May 21, 2024 (13 commits)

langchain4j (Owner) commented May 6, 2024

Issue

#232

Change

An experimental SqlDatabaseContentRetriever has been added.

Simplest usage example:

ContentRetriever contentRetriever = SqlDatabaseContentRetriever.builder()
    .dataSource(dataSource)
    .chatLanguageModel(openAiChatModel)
    .build();

In this case, the SQL dialect and table structure are determined from the DataSource.
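To illustrate what "determined from the DataSource" could look like, here is a minimal sketch built on standard JDBC metadata calls. It is not the code from this PR: the class SchemaIntrospector and its methods are hypothetical names chosen for illustration, and rendering the structure as CREATE TABLE statements is an assumption about what a useful prompt input might be.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.sql.DataSource;

// Hypothetical helper for illustration only; not part of langchain4j.
class SchemaIntrospector {

    // Reads the database product name, usable as a dialect hint (e.g. "PostgreSQL").
    static String dialect(DataSource dataSource) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            return connection.getMetaData().getDatabaseProductName();
        }
    }

    // Collects table name -> ["column TYPE", ...] for all plain tables visible to the connection.
    static Map<String, List<String>> tables(DataSource dataSource) throws SQLException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try (Connection connection = dataSource.getConnection()) {
            DatabaseMetaData metaData = connection.getMetaData();
            try (ResultSet tables = metaData.getTables(null, null, "%", new String[]{"TABLE"})) {
                while (tables.next()) {
                    result.put(tables.getString("TABLE_NAME"), new ArrayList<>());
                }
            }
            for (Map.Entry<String, List<String>> table : result.entrySet()) {
                try (ResultSet columns = metaData.getColumns(null, null, table.getKey(), "%")) {
                    while (columns.next()) {
                        table.getValue().add(
                                columns.getString("COLUMN_NAME") + " " + columns.getString("TYPE_NAME"));
                    }
                }
            }
        }
        return result;
    }

    // Renders the collected structure as CREATE TABLE statements that could be placed in a prompt.
    static String toDdl(Map<String, List<String>> tables) {
        StringBuilder ddl = new StringBuilder();
        tables.forEach((name, columns) -> ddl
                .append("CREATE TABLE ").append(name).append(" (\n  ")
                .append(String.join(",\n  ", columns))
                .append("\n);\n"));
        return ddl.toString();
    }
}
```

With a real DataSource, SchemaIntrospector.toDdl(SchemaIntrospector.tables(dataSource)) would produce text that could be passed to the language model alongside the user's question.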

Both can be customized, along with the prompt template and the number of retries:

ContentRetriever contentRetriever = SqlDatabaseContentRetriever.builder()
    .dataSource(dataSource)
    .sqlDialect("PostgreSQL")
    .databaseStructure(...)
    .promptTemplate(...)
    .chatLanguageModel(openAiChatModel)
    .maxRetries(2)
    .build();

See SqlDatabaseContentRetrieverIT for a full example.
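The description does not spell out what maxRetries does. One plausible reading, sketched below, is that a generated query which fails validation or execution is sent back to the model together with the error, up to the configured number of extra attempts. This is an assumption for illustration, not the implementation in this PR: RetryingSqlRunner, SqlExecutor, the SELECT-only guard, and the re-prompt format are all hypothetical.

```java
import java.util.function.Function;

// Hypothetical sketch for illustration only; not the langchain4j implementation.
class RetryingSqlRunner {

    // Executes a SQL string and returns the result as text; may throw on invalid SQL.
    interface SqlExecutor {
        String execute(String sql) throws Exception;
    }

    // Asks the generator for SQL, runs it, and on failure re-prompts with the error.
    // Makes at most maxRetries + 1 attempts in total.
    static String run(String question,
                      Function<String, String> sqlGenerator,
                      SqlExecutor executor,
                      int maxRetries) {
        String prompt = question;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String sql = sqlGenerator.apply(prompt).trim();
            try {
                // Guard (an assumption): only statements starting with SELECT are let through.
                if (!sql.regionMatches(true, 0, "SELECT", 0, 6)) {
                    throw new IllegalArgumentException("Only SELECT statements are allowed");
                }
                return executor.execute(sql);
            } catch (Exception e) {
                // Feed the failed query and the error back so the next attempt can correct it.
                prompt = question + "\nPrevious query: " + sql + "\nError: " + e.getMessage();
            }
        }
        return ""; // no usable result after all attempts
    }
}
```

In this sketch, sqlGenerator stands in for the chat model call and executor for a JDBC query, so the loop can be tried with simple lambdas before wiring in real components.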

General checklist

  • There are no breaking changes
  • I have added unit and integration tests for my change
  • I have manually run all the unit and integration tests in the module I have added/changed, and they are all green
  • I have manually run all the unit and integration tests in the core and main modules, and they are all green

Checklist for adding new model integration

  • I have added my new module in the BOM

langchain4j (Owner, Author) commented

cc @dandreadis @geoand

geoand (Contributor) commented May 9, 2024

Also cc @jmartisk

jmartisk (Contributor) commented May 9, 2024

Looks cool. We have something like it in the Quarkus samples, but this looks more general.
Did you try it with various models? I remember trying to use codellama and mistral for the sample that we have, but they just didn't work properly (only GPT produced reasonable queries).

langchain4j (Owner, Author) commented

@jmartisk yes, I played with gpt-3.5-turbo, open-mistral-7b (via the Mistral API) and tinydolphin (via Ollama).
In the integration tests that I have, gpt-3.5-turbo works most of the time, open-mistral-7b in about 70% of cases, and tinydolphin only in simple cases. I can imagine it will work much better with https://ollama.com/library/sqlcoder or other text-to-SQL models.

3 participants