We already know which pages are likely to ask the LLM for help. When the user accesses one of those pages, we should preload the model with a spawned request, so that we don't shoot ourselves in the foot with a slow first response.
There is little harm in doing this when the model is already loaded, but it is a big improvement when it isn't.
When Ollama has to load the model first, it takes longer to answer our requests. We can force it to preload the model with a simple empty request:
curl http://localhost:11434/api/generate -d '{"model": "llava:13b"}'
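A minimal sketch of what the spawned preload could look like, assuming a Python code base and curl on the PATH; the preload_model name and the choice of subprocess are illustrative assumptions, not taken from the repo:

import json
import subprocess

OLLAMA_URL = "http://localhost:11434/api/generate"

def preload_model(model: str = "llava:13b") -> None:
    # Fire the empty preload request in a detached child process, so the
    # page handler returns immediately while Ollama loads the model in
    # the background.
    subprocess.Popen(
        ["curl", "-s", OLLAMA_URL, "-d", json.dumps({"model": model})],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

Calling preload_model() from the handler of any page likely to need the LLM is harmless when the model is already resident: Ollama just answers the empty request immediately.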