convert-llama2c-to-ggml : enable conversion of GQA models (ggerganov#6237)

* convert-llama2c-to-ggml: enable conversion of multiqueries, ggerganov#5608

* add test in build action

* Update build.yml

* Update build.yml

* Update build.yml

* gg patch
fraxy-v authored and hodlen committed Apr 3, 2024
1 parent 6d9e216 commit 2cd14a4
Showing 3 changed files with 194 additions and 208 deletions.
11 changes: 11 additions & 0 deletions .github/workflows/build.yml
@@ -225,6 +225,17 @@ jobs:
          cd build
          ctest -L main --verbose --timeout 900

      - name: Test llama2c conversion
        id: llama2c_test
        run: |
          cd build
          echo "Fetch tokenizer"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/tok512.bin
          echo "Fetch llama2c model"
          wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K/stories260K.bin
          ./bin/convert-llama2c-to-ggml --copy-vocab-from-model ./tok512.bin --llama2c-model stories260K.bin --llama2c-output-model stories260K.gguf
          ./bin/main -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
# ubuntu-latest-cmake-sanitizer:
# runs-on: ubuntu-latest
#
2 changes: 2 additions & 0 deletions examples/convert-llama2c-to-ggml/README.md
@@ -21,6 +21,8 @@ An example command using a model from [karpathy/tinyllamas](https://huggingface.

`$ ./convert-llama2c-to-ggml --copy-vocab-from-model llama-2-7b-chat.gguf.q2_K.bin --llama2c-model stories42M.bin --llama2c-output-model stories42M.gguf.bin`

Note: The vocabulary for `stories260K.bin` should be its own tokenizer `tok512.bin` found in [karpathy/tinyllamas/stories260K](https://huggingface.co/karpathy/tinyllamas/tree/main/stories260K).

Now you can use the model with a command like:

`$ ./main -m stories42M.gguf.bin -p "One day, Lily met a Shoggoth" -n 500 -c 256`
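
For `stories260K.bin` specifically, the note above and the CI step in this commit can be combined into one short script. This is a minimal sketch, not part of the commit: it assumes the binaries sit in the current directory (as in the README examples) and that `wget` is available. The `run` and `convert_stories260k` helpers and the `DRY_RUN` switch are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Base URL as used by the CI step in this commit.
BASE_URL="https://huggingface.co/karpathy/tinyllamas/resolve/main/stories260K"

# Print a command, then execute it unless DRY_RUN=1.
# DRY_RUN is a hypothetical switch for this sketch only.
run() {
    printf '%s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Fetch the 512-token tokenizer and the model, convert, then generate.
convert_stories260k() {
    run wget "$BASE_URL/tok512.bin"
    run wget "$BASE_URL/stories260K.bin"
    run ./convert-llama2c-to-ggml \
        --copy-vocab-from-model ./tok512.bin \
        --llama2c-model stories260K.bin \
        --llama2c-output-model stories260K.gguf
    run ./main -m stories260K.gguf -p "One day, Lily met a Shoggoth" -n 500 -c 256
}
```

Calling `DRY_RUN=1 convert_stories260k` only prints the four commands, which is a cheap way to check the paths before downloading anything; calling `convert_stories260k` without it runs them.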
