optional max_tokens #4401

Open
wants to merge 1 commit into main

Conversation

@Algorithm5838 (Contributor) commented Mar 27, 2024

Use a checkbox to optionally enable max_tokens instead of keeping it disabled. This feature is useful for OpenAI models, as well as models from OpenRouter and other platforms.
I've set the default to 2048 for smaller-context (4k) models; however, 4096 is the preferred setting for newer models from OpenAI and Anthropic. Although these models support much larger contexts, their output is capped at 4096 tokens.
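
Roughly, the change looks like this (a simplified sketch; the actual field names in the diff may differ):

```ts
// Sketch of the setting shape; field names are illustrative, not the real diff.
interface ModelConfig {
  model: string;
  temperature: number;
  enableMaxTokens: boolean; // the new checkbox; off by default keeps current behavior
  max_tokens: number;       // 2048 by default; 4096 suits newer OpenAI/Anthropic models
}

// Include max_tokens in the request only when the user explicitly enabled it,
// so providers/models that behave better without the field are left untouched.
function buildRequestPayload(config: ModelConfig, messages: unknown[]) {
  const payload: Record<string, unknown> = {
    model: config.model,
    temperature: config.temperature,
    messages,
  };
  if (config.enableMaxTokens) {
    payload.max_tokens = config.max_tokens;
  }
  return payload;
}
```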

vercel bot commented Mar 27, 2024

@Algorithm5838 is attempting to deploy a commit to the NextChat Team on Vercel.

A member of the Team first needs to authorize it.

Your build has completed!

Preview deployment

@H0llyW00dzZ (Contributor) commented:


@Algorithm5838 Just letting you know, there is a bug related to the attach-messages feature caused by the max_tokens setting in this chat.ts file.
The logic needs to be refactored, because the way messages are attached is not consistent and depends on the max_tokens value.
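
If it helps, the coupling is roughly this (a simplified illustration, not the exact chat.ts code; the names are approximations):

```ts
// Simplified view of the problem: the number of attached history messages is
// bounded by a threshold derived from max_tokens, so changing max_tokens (or
// making it optional) silently changes how much history gets sent.
function countAttachedMessages(
  historyTokenCounts: number[], // token count per message, newest last
  maxTokens: number,            // meant as an output cap, but reused as an input budget here
): number {
  const maxTokenThreshold = Math.max(maxTokens, 1024); // placeholder heuristic
  let tokenCount = 0;
  let attached = 0;
  for (let i = historyTokenCounts.length - 1; i >= 0; i -= 1) {
    if (tokenCount + historyTokenCounts[i] > maxTokenThreshold) break;
    tokenCount += historyTokenCounts[i];
    attached += 1;
  }
  return attached;
}
```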

@Algorithm5838 (Contributor, Author) commented Mar 28, 2024

You are correct. I encountered it before and solved it by commenting out this part:

          i >= contextStartIndex;// && tokenCount < maxTokenThreshold;

The issue with the logic is that it assumes max_tokens covers input + output, when it is actually the output limit only.
The right way is to budget the context (input) tokens against each model's context window instead.
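
A sketch of what that refactor could look like (illustrative only; the context-window table and identifiers are assumptions, not the actual chat.ts code):

```ts
// Budget the attached history against the model's *context window* (input),
// reserving max_tokens for the reply, instead of trimming history by max_tokens.
const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-3.5-turbo": 4096,           // illustrative values
  "gpt-4-turbo-preview": 128000,
  "claude-3-opus-20240229": 200000,
};

function selectHistory(
  messages: { content: string; tokens: number }[],
  model: string,
  maxOutputTokens: number, // the max_tokens value sent with the request
) {
  const contextWindow = CONTEXT_WINDOWS[model] ?? 4096;
  // Reserve room for the reply; whatever is left is the input budget.
  const inputBudget = Math.max(contextWindow - maxOutputTokens, 0);

  const selected: typeof messages = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i -= 1) {
    if (used + messages[i].tokens > inputBudget) break;
    used += messages[i].tokens;
    selected.unshift(messages[i]); // keep chronological order
  }
  return selected;
}
```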

@H0llyW00dzZ (Contributor) commented:


I figured that out a few weeks ago when trying to implement Anthropic support with my friends.
