
Replace Falcon 🦅 model with Llama V2 🦙 for offline chat #352

Merged: 11 commits merged from features/replace-falcon-with-llamav2 into master on Jul 28, 2023

Conversation

@sabaimran (Member) commented on Jul 27, 2023:

Incoming

  • Rather than release the first local LLM with Falcon, use Llama with the existing GPT4All setup (a minimal usage sketch follows below).
  • Rework the extract_questions flow. Falcon's output wasn't of high enough quality to actually use this functionality, but Llama generally provides good enough responses (it passes roughly 50% of the benchmark tests). That said, query times might be too long to justify using it.
  • Rename all references from Falcon -> Llama.
  • Update the relevant unit tests to match Llama's capabilities.

Closes #347 (Use llamav2 instead of falcon as open source model for Khoj)
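
For context, here is a minimal sketch of how a Llama 2 chat model can be loaded and queried through the GPT4All Python bindings. The model filename, prompt, and generation parameters below are illustrative assumptions, not the exact values used in this PR or in Khoj's code.

```python
from gpt4all import GPT4All

# Illustrative Llama 2 GGML weights file; the exact model file Khoj uses may differ.
MODEL_NAME = "llama-2-7b-chat.ggmlv3.q4_0.bin"

# Load the model through the GPT4All bindings; assumes the weights are
# available locally or downloadable from the GPT4All model registry.
model = GPT4All(MODEL_NAME)

# Roughly mirrors the extract_questions idea: ask the model to turn a chat
# message into standalone search questions. Prompt and parameters are illustrative.
prompt = (
    "Extract the standalone search questions implied by this message:\n"
    "What did I write about my trip to the mountains last year?"
)
print(model.generate(prompt, max_tokens=256, temp=0.2))
```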

@sabaimran requested a review from @debanjum on Jul 27, 2023, 20:31.
@debanjum (Member) left a comment:

Nice!! Feels like Llama v2 may just make offline chat useful enough 🤞🏾🏕️

Review comments were left on:

  • tests/test_gpt4all_chat_actors.py (resolved)
  • tests/test_gpt4all_chat_actors.py (outdated, resolved)
  • src/khoj/routers/api.py (outdated, resolved)
  • src/khoj/routers/helpers.py (outdated, resolved)
Follow-up commits pushed to the branch:

- Change llama-specific naming in chat_model methods to be general offline
- Fix reasoning of assertion failure in one of the gpt4all actor tests
@sabaimran merged commit 124d97c into master on Jul 28, 2023 (3 of 4 checks passed).
@sabaimran deleted the features/replace-falcon-with-llamav2 branch on Jul 28, 2023.