
Count system prompt tokens #850

Merged

merged 13 commits into from Mar 22, 2024

Conversation

@mishig25 (Collaborator) commented Feb 20, 2024

On updating model settings:

Screenshot 2024-03-18 at 3 16 28 PM



On updating assistants settings:

image

@mishig25 mishig25 changed the title Count sysmte prompt tokens Count system prompt tokens Feb 20, 2024
@gary149 (Collaborator) commented Feb 20, 2024

There are some errors with some models, I think 👀

@mishig25 (Collaborator, Author):

> There are some errors with some models, I think 👀

The reason is that https://huggingface.co/meta-llama/Llama-2-70b-chat-hf is a gated model, so when the frontend tokenizer code tries to fetch the tokenizer files, the request fails.

For now, the behaviour is: if the model is not gated, the token count is shown; if it is gated, nothing appears (see the sketch at the end of this comment).

We have two possibilities:

  1. Keep the current behaviour, where no tokens are shown for gated models.
  2. Move the token-counting code to the backend/server so that tokens can be shown for all models.

wdyt?
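
For context, a minimal sketch of the failure mode described above, assuming the frontend counts tokens with transformers.js (`@xenova/transformers`); the function name is illustrative, not the PR's actual code:

```ts
import { AutoTokenizer } from "@xenova/transformers";

// Try to count prompt tokens client-side. For a gated repo such as
// meta-llama/Llama-2-70b-chat-hf, the unauthenticated fetch of the
// tokenizer files fails, so no count can be shown.
async function countPromptTokens(modelId: string, prompt: string): Promise<number | undefined> {
	try {
		const tokenizer = await AutoTokenizer.from_pretrained(modelId);
		return tokenizer.encode(prompt).length;
	} catch {
		return undefined; // gated or missing tokenizer: hide the counter
	}
}
```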

@gary149 (Collaborator) commented Feb 21, 2024

I think the current behavior is good. But it shouldn't even try to tokenize if the tokenizer is not available (here it throws 100s of console errors).

image

@nsarrazin added the enhancement (New feature or request), front (related to the front-end of the app), and back (related to the Svelte backend or the DB) labels on Feb 22, 2024
@mishig25 mishig25 marked this pull request as ready for review March 18, 2024 15:12
@mishig25 mishig25 marked this pull request as draft March 18, 2024 15:17
@mishig25 (Collaborator, Author):

> I think the current behavior is good. But it shouldn't even try to tokenize if the tokenizer is not available (here it throws 100s of console errors).

Handled it. Only one error gets thrown in this case now.
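
One plausible way to get that single-error behaviour (a hedged sketch, not necessarily the PR's implementation) is to memoize the tokenizer load per model, so a failed fetch is reported once rather than on every keystroke:

```ts
import { AutoTokenizer, PreTrainedTokenizer } from "@xenova/transformers";

// Cache one load attempt per model id; a rejected load resolves to null,
// so later calls reuse the result instead of re-fetching and re-erroring.
const tokenizerCache = new Map<string, Promise<PreTrainedTokenizer | null>>();

function getTokenizer(modelId: string): Promise<PreTrainedTokenizer | null> {
	let cached = tokenizerCache.get(modelId);
	if (!cached) {
		cached = AutoTokenizer.from_pretrained(modelId).catch((err) => {
			console.error(err); // surfaced exactly once per model
			return null;
		});
		tokenizerCache.set(modelId, cached);
	}
	return cached;
}
```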

@mishig25 mishig25 marked this pull request as ready for review March 18, 2024 16:41
@mishig25 mishig25 marked this pull request as draft March 18, 2024 16:56
@mishig25 mishig25 marked this pull request as ready for review March 18, 2024 17:02
@gary149 (Collaborator) commented Mar 19, 2024

Just don't do it on llama-2? (As it's gated, it will never work.) Maybe it could be in the config.

@nsarrazin (Collaborator) left a comment:

When it works, it seems to work well, but I sometimes get console errors when switching between models in the settings without reloading the page. If I reload the page, tokenization works again. Not sure what is causing it.

@gary149 (Collaborator) commented Mar 19, 2024

> When it works, it seems to work well, but I sometimes get console errors when switching between models in the settings without reloading the page. If I reload the page, tokenization works again. Not sure what is causing it.

Yes, trying to reproduce that too 👀

A collaborator commented on the diff:

This won't work on local models with chat-ui. I would probably set an explicit key in the config for models that are compatible.

This reverts commit 61c0a22.
@mishig25 (Collaborator, Author):

Updates:

Added a model.tokenizer config. You can supply either:

a string, which is passed to the transformers.js AutoTokenizer

OR

```ts
{
    tokenizerUrl: string;
    tokenizerConfigUrl: string;
}
```

which is used to construct a transformers.js PreTrainedTokenizer.

See the .env.template file of this PR. Important: when testing locally, make sure you have copied model.tokenizer from the .env.template of this PR.
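
A hedged sketch of how such a config could be resolved (illustrative names, assuming `@xenova/transformers`; not the exact PR code):

```ts
import { AutoTokenizer, PreTrainedTokenizer } from "@xenova/transformers";

type TokenizerConfig = string | { tokenizerUrl: string; tokenizerConfigUrl: string };

async function loadTokenizer(config: TokenizerConfig): Promise<PreTrainedTokenizer> {
	if (typeof config === "string") {
		// a Hub model id, e.g. "mistralai/Mistral-7B-Instruct-v0.1"
		return AutoTokenizer.from_pretrained(config);
	}
	// explicit URLs for tokenizer.json and tokenizer_config.json
	const [tokenizerJSON, tokenizerConfig] = await Promise.all([
		fetch(config.tokenizerUrl).then((res) => res.json()),
		fetch(config.tokenizerConfigUrl).then((res) => res.json()),
	]);
	return new PreTrainedTokenizer(tokenizerJSON, tokenizerConfig);
}
```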

Fixed the reactivity issues mentioned above (here and here).

Editing model system prompts

Screen.Recording.2024-03-19.at.12.21.00.PM.mov

Creating an assistant

Screen.Recording.2024-03-19.at.12.08.24.PM.mov

Editing an assistant

Screen.Recording.2024-03-19.at.12.20.18.PM.mov

```svelte
classNames="absolute bottom-2 right-2"
prompt={systemPrompt}
modelTokenizer={model.tokenizer}
max_new_tokens={model?.parameters?.max_new_tokens}
```
A collaborator commented on the diff:

use truncate value instead?
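
Presumably this suggests binding the model's truncate parameter (the prompt-truncation limit) rather than max_new_tokens; a hypothetical one-line change:

```svelte
<!-- hypothetical: compare the counted tokens against the truncate limit -->
truncate={model?.parameters?.truncate}
```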

@mishig25 (Collaborator, Author):

handled in f47c6bb

@nsarrazin (Collaborator) left a comment:

Works quite well! 🔥

@gary149 gary149 self-requested a review March 22, 2024 12:39
@gary149 gary149 merged commit 50d8483 into main Mar 22, 2024
3 checks passed
@gary149 gary149 deleted the show_number_of_tokens branch March 22, 2024 12:51
Labels

back (This issue is related to the Svelte backend or the DB), enhancement (New feature or request), front (This issue is related to the front-end of the app)

Projects

None yet

Development

Successfully merging this pull request may close these issues: none yet.

3 participants