[BUG] Incorrect Token Cost Calculations #125

Open · ibehnam opened this issue Jun 1, 2024 · 4 comments
Labels: bug, low priority

Comments

ibehnam commented Jun 1, 2024

I noticed the following when adding new models to Koala:

  1. It's not easy to enter decimal numbers. OpenRouter shows costs per 1M tokens, while Koala asks for cost per 1K tokens, so a $3/1M cost (e.g., for the Cohere Command R+ model) should be entered as $0.003. But `.` is not accepted in the text field; the workaround is to type 3, then manually move to the beginning, enter `.`, and repeat that to enter the 0s. (See the conversion sketch after this list.)
  2. Even after entering the decimal numbers like this, the total cost is not calculated correctly (I checked my usage on OpenRouter after 24 hours and saw completely different numbers).
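
For reference, the conversion is just a division by 1,000. A minimal sketch in TypeScript (the function name is illustrative, not Koala's actual code):

```ts
// Convert a provider's price per 1M tokens into Koala's price per 1K tokens.
// E.g. OpenRouter lists Cohere Command R+ at $3 per 1M input tokens,
// which must be entered in Koala as $0.003 per 1K tokens.
function perMillionToPerThousand(pricePerMillion: number): number {
  return pricePerMillion / 1000;
}

console.log(perMillionToPerThousand(3)); // 0.003
```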
ibehnam changed the title from "[ BUG] Incorrect Token Cost Calculations" to "[BUG] Incorrect Token Cost Calculations" on Jun 1, 2024

ibehnam commented Jun 1, 2024

It seems the total costs are wrong by a factor of 6:

I reset the costs and created a new chat where I asked the model to generate a short story. The actual cost reported by OpenRouter is almost 1/6 of the total cost shown in the app settings.


jackschedel (Owner) commented

Do you know if the model you're using uses OpenAI's tokenizer or something else? The token count is just computed with OpenAI's tokenizer.
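
For context, counting with an OpenAI encoding looks roughly like the sketch below (using the js-tiktoken package purely as an illustration; Koala's actual counting code may differ):

```ts
import { getEncoding } from "js-tiktoken";

// cl100k_base is the encoding used by gpt-3.5-turbo and gpt-4.
const enc = getEncoding("cl100k_base");

// This count is only accurate for models that share OpenAI's tokenizer.
// Cohere's Command R+ tokenizes differently, so a cost estimate based on
// this count can be off by a large factor for that model.
const tokenCount = enc.encode("Tell me a short story.").length;
console.log(tokenCount);
```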


ibehnam commented Jun 2, 2024

Ahh, that explains it. I'm using Cohere's Command R+, which has a different tokenizer. It's probably easier to get the token usage info directly from the response:

{"id":"gen-...","model":"openai/gpt-3.5-turbo","object":"chat.completion","created":1717306340,"choices":[{"index":0,"message":{"role":"assistant","content":"The meaning of life is a deeply personal and philosophical question that has been asked by humans for centuries. There are many different beliefs and theories about the meaning of life, and it ultimately depends on an individual's own values, beliefs, and experiences. Some people find meaning in religion or spirituality, others in relationships and connections with others, and some in personal growth and self-discovery. Ultimately, the meaning of life is what each individual makes of it, and it may vary greatly from person to person."},"finish_reason":"stop"}],"system_fingerprint":null,

**"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}**

I don't know TS, otherwise I'd open a PR myself :)
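
A minimal TypeScript sketch of that suggestion (the types and rates here are illustrative, not Koala's actual code):

```ts
// Compute the actual cost from the usage block the API returns,
// instead of re-tokenizing the text locally.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Prices are per 1K tokens; the values below are hypothetical.
function costFromUsage(
  usage: Usage,
  promptPricePer1K: number,
  completionPricePer1K: number,
): number {
  return (
    (usage.prompt_tokens / 1000) * promptPricePer1K +
    (usage.completion_tokens / 1000) * completionPricePer1K
  );
}

// With the usage block from the response above:
console.log(
  costFromUsage(
    { prompt_tokens: 14, completion_tokens: 100, total_tokens: 114 },
    0.003,
    0.015,
  ),
); // 0.001542
```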

jackschedel (Owner) commented

I suppose the total cost could be determined automatically behind the scenes from the response, but the per-chat estimate would never be right unless we had hardcoded logic for different model names.

This would be really annoying to implement, since we'd somehow need different behavior per model even though models can be custom-defined. The scope of this fix just massively increased, so idk if I'll ever implement it myself.
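
A hypothetical sketch of what per-model behavior might look like (none of these names exist in Koala; this only illustrates the scope problem):

```ts
// Registry mapping model names to token-counting functions, with a fallback
// (e.g. the OpenAI tokenizer) for custom-defined models we know nothing about.
type TokenCounter = (text: string) => number;

const counters: Record<string, TokenCounter> = {
  // "openai/gpt-3.5-turbo": (text) => countWithCl100k(text),   // hypothetical
  // "cohere/command-r-plus": (text) => countWithCohere(text),  // hypothetical
};

function countTokens(model: string, text: string, fallback: TokenCounter): number {
  return (counters[model] ?? fallback)(text);
}
```

Every custom-defined model would need its own entry in a registry like this, which is exactly the maintenance burden described above.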

jackschedel added the bug and low priority labels on Jun 2, 2024
jackschedel added a commit that referenced this issue Nov 16, 2024