add support for tools for the ollama provider #662
base: main
Conversation
This is really interesting.
Just so it's clear - does this work with the latest Ollama version or do we still need to wait for that feature to land?
...a/deployment/src/main/java/io/quarkiverse/langchain4j/ollama/deployment/OllamaProcessor.java
import io.quarkus.test.QuarkusUnitTest;

@Disabled("Integration tests that need an ollama server running")
public class ToolsTest {
We generally don't write such tests, but instead use Wiremock (see the OpenAI module for tools related tests)
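The idea is to stub the Ollama HTTP endpoint instead of requiring a live server. As a self-contained illustration of that idea using only the JDK (the actual module tests use WireMock; the endpoint path and response body here are made-up placeholders, not the real Ollama wire format):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Minimal sketch of mocking the model server so tests don't need a
// running Ollama instance. Endpoint and JSON body are illustrative only.
public class MockOllamaSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/api/chat", exchange -> {
            // Canned "assistant asks for a tool" reply (hypothetical shape)
            byte[] body = "{\"message\":{\"role\":\"assistant\",\"content\":\"{\\\"tool\\\":\\\"getWeather\\\"}\"}}"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        int port = server.getAddress().getPort();

        // The code under test would point its base URL at the stub server
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:" + port + "/api/chat"))
                .POST(HttpRequest.BodyPublishers.ofString("{\"model\":\"llama3\"}"))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        server.stop(0);
    }
}
```

WireMock provides the same pattern with richer request matching and verification, which is why the OpenAI module uses it.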
...llama/runtime/src/main/java/io/quarkiverse/langchain4j/ollama/OllamaDefaultToolsHandler.java
return toolSpecifications.stream()
        .filter(ts -> ts.name().equals(toolResponse.tool))
        .map(ts -> toToolExecutionRequest(toolResponse, ts))
        .toList();
We generally try hard to avoid lambdas in Quarkus code
why (just curious)?
When Quarkus started, the team found that lambdas had a small (but not zero) impact on memory usage.
Mind you, this was on Java 8, so things may have changed substantially since then, but we still try to avoid them unless the alternative is just plain terrible.
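For illustration, the stream pipeline above could be rewritten lambda-free as a plain loop. This sketch uses simplified, hypothetical stand-ins for the real ToolSpecification/ToolExecutionRequest types, just to show the shape of the change:

```java
import java.util.ArrayList;
import java.util.List;

// Lambda-free rewrite of the filter/map/toList pipeline, using
// simplified placeholder records instead of the actual module classes.
public class LambdaFreeExample {
    record ToolSpecification(String name) {}
    record ToolResponse(String tool) {}
    record ToolExecutionRequest(String name) {}

    static List<ToolExecutionRequest> toRequests(List<ToolSpecification> specs,
                                                 ToolResponse response) {
        List<ToolExecutionRequest> result = new ArrayList<>();
        for (ToolSpecification ts : specs) {
            // Keep only the specification matching the tool the model asked for
            if (ts.name().equals(response.tool())) {
                result.add(new ToolExecutionRequest(ts.name()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<ToolSpecification> specs = List.of(
                new ToolSpecification("getWeather"),
                new ToolSpecification("sendPoem"));
        List<ToolExecutionRequest> reqs = toRequests(specs, new ToolResponse("sendPoem"));
        System.out.println(reqs.size() + " " + reqs.get(0).name());
    }
}
```

The loop form is slightly more verbose but allocates no capturing lambda instances.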
...llama/runtime/src/main/java/io/quarkiverse/langchain4j/ollama/OllamaDefaultToolsHandler.java
Yes, it works with the latest Ollama version.
Very nice, I'll give it a try tomorrow
This is super interesting, but unfortunately it does not work properly :(. The issue seems to be that Ollama does not understand that the tool has been executed and keeps telling us to re-execute it. 1st request:
1st response:
After this the extension properly executed the tool:
Then the following is sent to Ollama:
The response however is now problematic:
As you can see, it tells us to execute the tool again... This keeps happening without the sequence ever ending from the Ollama side.
 * Whether to enable the experimental tools
 */
@WithDefault("false")
Optional<Boolean> experimentalTools();
Shouldn't we rather just name it tools and mark it as experimental in a comment? That would avoid having to make a breaking change once we no longer consider it experimental...
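Concretely, the suggested rename might look something like this (a sketch only — the final property name was still under discussion in the thread):

```java
/**
 * Whether to enable tools.
 * <p>
 * Note: tools support is currently considered experimental.
 */
@WithDefault("false")
Optional<Boolean> tools();
```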
I would name it enableTools
I think this will stay experimental until Ollama implements the tools feature.
Yes, the issue with this approach is that the LLM should be aware that the tool has been executed. We could take a simplified approach where we just trigger one tool without recursion, or the tools should always answer with a status for the LLM. I will try to add an example with sendPoem.
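One possible shape for the "tools answer with a status" idea is to wrap each tool result in an explicit status line appended to the conversation, so the model can see the execution already happened. This is a purely hypothetical helper, not code from this PR:

```java
// Hypothetical sketch: make tool execution visible to the model by
// feeding back an explicit status marker instead of a bare result.
public class ToolStatusMessage {
    static String executed(String tool, String result) {
        return "[TOOL " + tool + " STATUS=EXECUTED RESULT=" + result + "]";
    }

    public static void main(String[] args) {
        // This string would be appended to the messages sent back to Ollama
        System.out.println(executed("sendPoem", "poem sent"));
    }
}
```

Whether such a marker actually stops the model from re-requesting the tool depends on the model and the prompt, which is exactly what the thread goes on to explore.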
…avoid unnecessary multiple calls to same tool
I've updated the prompt and added some tool history to the user messages, but I didn't find a good way to avoid selecting the same tool twice. Perhaps @langchain4j, @jmartisk or @geoand can help here?
By selecting twice, do you mean the tool gets executed twice?
yes
We can't really do much here; the LLM is supposed to decide which tools need to get executed, as complex workflows may need multiple tool invocations. OpenAI handles this seamlessly.
yes, but it's weird that this one https://github.com/quarkiverse/quarkus-langchain4j/pull/662/files#diff-4cad3d1a7b72dca01c9cf8f6019dfdc9c8949b729fdafe2cbda381631db6f88bR34 seems to work correctly even though it is more complex than the sendPoem one. I think I'm missing the correct inputs/prompt to tell the LLM that the action has been executed.
In that case, I would turn on logging of requests and responses and compare the one that works with the one that does not.
By the way, I want to be clear that if we can get this to work properly, it's a no-brainer for inclusion :)
New approach: ask the LLM to create a list of tools to execute, then respond with the previous result. It needs some modification in core: https://github.com/quarkiverse/quarkus-langchain4j/pull/662/files#diff-2dd3bec40934ad6d175f6f14dad1af0e11c234cf5fec69739a89460d472ab55b I need to check the broken OpenAI tests, but they don't work on my side. Still in progress, but the main part could be done in LangChain4j and then used by the Ollama models from both LangChain4j and quarkus-langchain4j.
In order to replace some AI responses containing variables with function results, I've changed the order of the chat memory. WDYT? Could I change the message order in the tests, or should I keep the existing order and adapt the tool executor part?
OK, here it is -> langchain4j/langchain4j#1353. It's a little bit confusing though; there is a lot of code that is not taken from LangChain4j.
A lot of the tools and AI service related code needs to be different in Quarkus in order to account for doing stuff at build time
yes, that's why I've isolated the new tools stuff in LangChain4j, to be able to use it in quarkus-langchain4j too.
🙏
I will take another look today
Thanks @geoand. For the moment the Quarkus part is on pause; I've implemented it in LangChain4j and will integrate it into Quarkus later.
Do you have a LangChain4j example I can try your change on?
I was not able to get that test to pass locally
What was the issue?
Assertions failed :)
I should add that this is running against my local Ollama server - not a Docker container
Your Ollama server seems funnier than mine: "The result of 1 + 1 is... (drumroll please)... 2!" :) Did you tune your Ollama server to be funnier after Quarkus Insights 170? I could adjust the checks to be more flexible, but it would be better if I manage to reproduce this and adjust the sequential prompt. Are all the parallel tests working?
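The "more flexible checks" mentioned above could mean asserting only that the expected value appears somewhere in the model's answer, rather than matching the whole string. A minimal sketch of that idea (hypothetical helper, not the PR's test code):

```java
// Looser assertion: tolerate a chatty model answer by checking only
// that the expected arithmetic result appears in it.
public class FlexibleCheck {
    static boolean containsResult(String answer, String expected) {
        return answer.contains(expected);
    }

    public static void main(String[] args) {
        String funny = "The result of 1 + 1 is... (drumroll please)... 2!";
        System.out.println(containsResult(funny, "2"));
    }
}
```

The trade-off is that a substring check can pass for the wrong reason (e.g. "2" appearing in unrelated text), so it should be paired with tool-invocation assertions.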
Nope, just stock Ollama
Weird! Could you test with the Ollama container? It will take longer if you are on a Mac (~30 minutes), but just to check whether it behaves the same. Perhaps there is some context on your local Ollama server that produces those issues.
I'm looking at using the new classes in quarkus-langchain4j, but it seems a lot of standard code has been duplicated: ChatRequest, Response, Roles, and OllamaClient (this last one should stay different but could use a dedicated interface). Do you see any issue if I move them back into LangChain4j, or if I use interfaces where needed, like for OllamaClient?
I'll try tomorrow.
I think they were duplicated so we could move fast. The best solution would be an approach like we use for the Mistral client, where an SPI is introduced in LangChain4j and then used in Quarkus to supply the proper client.
Using the container, I got this:
woooo!! Could you add the Ollama request/response logs from when those errors occur? During the quarkus-langchain4j integration I've seen that the tool definition is not built the same way in LangChain4j and quarkus-langchain4j because of the -parameters compiler flag, and that impacts the LLM response: the same test that works in LangChain4j does not work in quarkus-langchain4j.
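The -parameters issue is easy to see with reflection: without the javac -parameters flag, parameter names compile down to synthetic names like arg0, so a tool definition derived from them would describe its arguments differently to the model. A small self-contained demonstration (the getWeather method is a made-up example tool):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Shows what reflection sees for a method's parameter names.
// Compiled with -parameters: prints the real name ("city").
// Compiled without it: prints a synthetic name ("arg0").
public class ParametersFlagDemo {
    public static String getWeather(String city) {
        return "sunny in " + city;
    }

    public static void main(String[] args) throws Exception {
        Method m = ParametersFlagDemo.class.getMethod("getWeather", String.class);
        for (Parameter p : m.getParameters()) {
            System.out.println(p.getName() + " (real name present: " + p.isNamePresent() + ")");
        }
    }
}
```

A tool schema built from these names would therefore differ between the two builds, which is consistent with the behavior difference described above.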
Here is the entire interaction:
Nope
In any case, I would really, really like to have a real standalone example so I can run it against an Ollama server
I've added a simple test. I will update the Quarkus PR to use LangChain4j 0.32.0-SNAPSHOT, where there is a simple Quarkus test doing the same.
🙏🏼
I actually want some code that I can use, not a test environment - just a sample Ollama application involving tools that I can run against Ollama manually. The reason I want this is so I can go through the calls and see what's going on.
I will be back on Monday. By that time you can just transform the standalone test into a simple main. Just use the local Ollama server, not the Docker one. It should do the job. On Monday I will put up a Quarkus sample app using LangChain4j 0.32.0-SNAPSHOT that you will be able to use.
Proposition for #305. Tested on llama3; it does not yet work with other models.
Draft to discuss the proposition.
Based on the experimental Python implementation and the discussion here.
It's a way to have tools working until the Ollama fix is available.
To discuss whether we want this in LangChain4j, quarkus-langchain4j, or both.
@jmartisk @langchain4j @geoand WDYT?