Provide Text generation usage metrics #3461

johnugeorge · 2024-02-20T13:42:29Z

With the Hugging Face runtime, we want to add support for usage metrics such as token counts (completion tokens + prompt tokens). Ref: the usage field in the OpenAI response, https://platform.openai.com/docs/api-reference/making-requests. The client can use this to calculate throughput (tokens/sec), etc.
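To illustrate the client-side calculation, here is a minimal sketch that assumes the response carries an OpenAI-style usage object with prompt_tokens, completion_tokens, and total_tokens. The helper name `tokens_per_second` and the sample response are hypothetical, not part of KServe or the Hugging Face runtime.

```python
from typing import Dict


def tokens_per_second(usage: Dict[str, int], elapsed_seconds: float) -> float:
    """Return completion tokens generated per second of wall-clock time.

    `usage` is assumed to follow the OpenAI-style shape:
    {"prompt_tokens": int, "completion_tokens": int, "total_tokens": int}.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return usage["completion_tokens"] / elapsed_seconds


# Hypothetical response body; a real client would parse this from the
# server's JSON reply and time the request itself.
response = {
    "usage": {"prompt_tokens": 12, "completion_tokens": 48, "total_tokens": 60}
}

throughput = tokens_per_second(response["usage"], elapsed_seconds=2.0)
print(f"{throughput:.1f} tokens/sec")  # prints "24.0 tokens/sec"
```

Only completion tokens are divided by elapsed time here, since prompt tokens are consumed rather than generated. A client that wants total processed tokens per second could use total_tokens instead.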

Validate: In streaming mode, can we get TTFT (Time to first token)?
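One way a client could measure TTFT itself, independent of what the server reports, is to timestamp the first non-empty chunk of the stream. This is a sketch under assumptions: `measure_ttft` and `fake_stream` are hypothetical names, the stream is modelled as a plain iterable of text chunks, and a chunk may hold more than one token, so the count below is chunks, not tokens.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Consume a stream and return (ttft_seconds, total_seconds, chunk_count).

    ttft_seconds is the delay until the first non-empty chunk arrives,
    or None if the stream produced no non-empty chunk.
    """
    start = clock()
    ttft: Optional[float] = None
    chunks = 0
    for chunk in stream:
        if ttft is None and chunk:
            ttft = clock() - start
        chunks += 1
    total = clock() - start
    return ttft, total, chunks


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming response (assumption, not a real client)."""
    for piece in ("Hello", ", ", "world"):
        time.sleep(0.01)  # simulate generation latency per chunk
        yield piece


ttft, total, chunk_count = measure_ttft(fake_stream())
print(f"TTFT: {ttft:.3f}s, total: {total:.3f}s, chunks: {chunk_count}")
```

A client-side measurement like this includes network and queueing latency. A server-side metric would need to be recorded inside the runtime, which is what this issue asks to validate.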

sivanantha321 · 2024-03-22T08:07:50Z

/assign

oss-prow-bot bot assigned sivanantha321 Mar 22, 2024

sivanantha321 linked a pull request Mar 25, 2024 that will close this issue: Expose Initial LLM metrics #3547 (Draft, 9 tasks)

yuzisun added the kserve/llm label May 11, 2024
