I would like a way that makes it easy to provide feedback about the performance of local LLM responses.
I think it could be useful for both developers and users to easily see information about the performance and quality of the local LLM in use.
Information that comes to mind:
- OS & version
- Relevant hardware info
  - CPU
    - cores
    - model
  - Memory
  - GPU model
  - VRAM
    - available
    - in use by Khoj
    - free
- Chat streaming response time
- Quality of response (if user-rated; I think there is another issue specifically for this)
- Time to do RAG
- Time to decode user input
- Time to emit output to user (I think the chat streaming time logged currently combines input decode time and output time; not sure if it is possible to get separate timings)
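A minimal, stdlib-only sketch of how such a diagnostic payload might be collected. The helper names (`collect_debug_info`, `timed`) are hypothetical, not existing Khoj APIs; GPU and VRAM stats would need a vendor tool such as nvidia-smi, so they are omitted here:

```python
import os
import platform
import time

def collect_debug_info() -> dict:
    """Gather OS and CPU details for a debug report (hypothetical helper).

    GPU model and VRAM usage are not exposed by the stdlib and would
    need vendor tooling (e.g. nvidia-smi) or a third-party library.
    """
    return {
        "os": f"{platform.system()} {platform.release()}",
        # platform.processor() can be empty on some Linux systems,
        # so fall back to the machine architecture string.
        "cpu_model": platform.processor() or platform.machine(),
        "cpu_cores": os.cpu_count(),
    }

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    Wrapping the RAG lookup, prompt decoding, and response streaming
    separately with a helper like this is one way to get the distinct
    timings mentioned above.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

info = collect_debug_info()
result, elapsed = timed(sum, range(1000))
```

Something like this could back a more detailed `/help`-style diagnostic output without adding dependencies; richer memory and VRAM stats would be the main gap.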
Interesting note! These are definitely relevant performance metrics, and thank you for collating this information. We have the /help endpoint in chat which would output some of this, but could definitely be more detailed.
From: https://discord.com/channels/1112065956647284756/1112066421577482262/1166885403798798427