Doing Batched Inference Using TokensPrompt #16388
Does anyone know how to enable batched inference using a `TokensPrompt` as input instead of text? Declaring a list of token lists, the way you would with a list of text inputs, doesn't work. For example, trying `tp = TokensPrompt({"prompt_token_ids": [[1, 2], [1, 2]]})` and then calling `model.generate(tp, sampling_params)` results in `TypeError: '>' not supported between instances of 'list' and 'int'`. However, passing a list of strings as input does batched inference correctly. Thanks!
Answered by DarkLight1337, Apr 18, 2025:
You should be passing a list of `TokensPrompt`s, not nesting the token lists inside one prompt.
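For reference, a minimal sketch of what that looks like, assuming vLLM's offline `LLM`/`SamplingParams` API and the `vllm.inputs.TokensPrompt` TypedDict; the model name and token IDs below are placeholders:

```python
# Minimal sketch: batched inference with one TokensPrompt per request.
# Assumes vLLM's offline LLM API; model name and token IDs are placeholders.
from vllm import LLM, SamplingParams
from vllm.inputs import TokensPrompt

model = LLM(model="facebook/opt-125m")  # example model
sampling_params = SamplingParams(max_tokens=16)

# Each request gets its own TokensPrompt; the outer list is the batch.
prompts = [
    TokensPrompt(prompt_token_ids=[1, 2]),
    TokensPrompt(prompt_token_ids=[1, 2]),
]

outputs = model.generate(prompts, sampling_params)
for out in outputs:
    print(out.prompt_token_ids, "->", out.outputs[0].text)
```

Each prompt's `prompt_token_ids` is expected to be a flat list of ints, which is presumably why nesting the lists inside a single `TokensPrompt` raises the `TypeError` above.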