Commit cd7b5f7


Make tokenizer.cpp CLI tool nicer.

Before this commit, tokenize was a simple CLI tool invoked like this:

    tokenize MODEL_FILENAME PROMPT [--ids]

That tool loads the model, takes the prompt, and shows the tokens llama.cpp is interpreting.

This changeset makes tokenize more sophisticated and more useful for debugging and troubleshooting:

    tokenize [-m, --model MODEL_FILENAME]
             [--ids]
             [--stdin]
             [--prompt]
             [-f, --file]
             [--no-bos]
             [--log-disable]

It also behaves better on Windows now, interpreting and rendering Unicode from command line arguments and pipes no matter what code page the user has set on their terminal.
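To make the new option set concrete, here is a minimal C++ sketch of how flags like these could be parsed. It is not the code from this commit: `TokenizeOptions` and `parse_tokenize_args` are hypothetical names, and the sketch assumes that `-m`/`--model`, `--prompt`, and `-f`/`--file` each take a value, that a model is required, and that exactly one prompt source (`--prompt`, `--file`, or `--stdin`) must be given. The commit message itself does not spell out those rules.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the options listed in the commit message.
struct TokenizeOptions {
    std::string model_path;             // -m, --model (assumed required)
    std::optional<std::string> prompt;  // --prompt (assumed to take a value)
    std::optional<std::string> file;    // -f, --file (assumed to take a value)
    bool from_stdin  = false;           // --stdin
    bool ids_only    = false;           // --ids
    bool no_bos      = false;           // --no-bos
    bool log_disable = false;           // --log-disable
};

// Parse the arguments that follow argv[0].
// Throws std::invalid_argument on unknown flags or missing values.
inline TokenizeOptions parse_tokenize_args(const std::vector<std::string> & args) {
    TokenizeOptions opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string & flag = args[i];

        // Consume the value that follows a flag, or fail if there is none.
        auto next_value = [&]() -> const std::string & {
            if (i + 1 >= args.size()) {
                throw std::invalid_argument(flag + " requires a value");
            }
            return args[++i];
        };

        if      (flag == "-m" || flag == "--model") { opts.model_path  = next_value(); }
        else if (flag == "--prompt")                { opts.prompt      = next_value(); }
        else if (flag == "-f" || flag == "--file")  { opts.file        = next_value(); }
        else if (flag == "--stdin")                 { opts.from_stdin  = true; }
        else if (flag == "--ids")                   { opts.ids_only    = true; }
        else if (flag == "--no-bos")                { opts.no_bos      = true; }
        else if (flag == "--log-disable")           { opts.log_disable = true; }
        else { throw std::invalid_argument("unknown argument: " + flag); }
    }

    if (opts.model_path.empty()) {
        throw std::invalid_argument("a model must be given with -m or --model");
    }

    // Assumption: the prompt must come from exactly one source.
    const int sources = (opts.prompt ? 1 : 0) + (opts.file ? 1 : 0) + (opts.from_stdin ? 1 : 0);
    if (sources != 1) {
        throw std::invalid_argument("give exactly one of --prompt, --file, --stdin");
    }
    return opts;
}
```

A caller would typically build the vector from `argv + 1` to `argv + argc` and report any `std::invalid_argument` message alongside the usage text. Keeping the parser free of I/O in this way makes it easy to unit test separately from model loading.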

1 parent 557410b commit cd7b5f7

1 file changed: +407 -10 lines changed

examples/tokenize/tokenize.cpp
