ValueError: please provide at least one prompt #95

Open
TikaToka opened this issue Mar 11, 2025 · 7 comments

TikaToka commented Mar 11, 2025

Hello, thank you for sharing this amazing work.

I am trying to evaluate my model with lm_eval --model vllm --model_args pretrained=ckpts/s1-20250310_141828,dtype=bfloat16,tensor_parallel_size=2 --tasks aime25_nofigures --batch_size auto --apply_chat_template --output_path s1.1forcingignore1wait --log_samples --gen_kwargs "max_gen_toks=20000,temperature=0,temperature_thinking=0,max_tokens_thinking=20000,thinking_n_ignore=1,thinking_n_ignore_str=Wait"

However, I get the following error:

Processed prompts: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [05:40<00:00, 22.69s/it, est. speed input: 6.14 toks/s, output: 881.59 toks/s]
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/hslim/miniforge3/envs/s1/bin/lm_eval", line 8, in <module>
[rank0]:     sys.exit(cli_evaluate())
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/__main__.py", line 394, in cli_evaluate
[rank0]:     results = evaluator.simple_evaluate(
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/evaluator.py", line 301, in simple_evaluate
[rank0]:     results = evaluate(
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/evaluator.py", line 506, in evaluate
[rank0]:     resps = getattr(lm, reqtype)(cloned_reqs)
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/models/vllm_causallms.py", line 576, in generate_until
[rank0]:     cont = self._model_generate(
[rank0]:   File "/home/hslim/data/models/s1/eval/lm-evaluation-harness/lm_eval/models/vllm_causallms.py", line 339, in _model_generate
[rank0]:     outputs_tmp = self.model.generate(
[rank0]:   File "/home/hslim/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/utils.py", line 1063, in inner
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/hslim/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 378, in generate
[rank0]:     parsed_prompts = self._convert_v1_inputs(
[rank0]:   File "/home/hslim/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 803, in _convert_v1_inputs
[rank0]:     p["content"] for p in parse_and_batch_prompt(prompt_token_ids)
[rank0]:   File "/home/hslim/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/inputs/parse.py", line 43, in parse_and_batch_prompt
[rank0]:     raise ValueError("please provide at least one prompt")
[rank0]: ValueError: please provide at least one prompt

I've tried changing max_gen_toks and max_tokens_thinking, but it does not help.

The error goes away when I try a different model, and it also works without the "Wait" budget forcing:
lm_eval --model vllm --model_args pretrained=ckpts/s1-20250310_141828,dtype=bfloat16,tensor_parallel_size=2 --tasks aime24_figures,aime24_nofigures --batch_size auto --output_path dummy --log_samples --gen_kwargs "max_gen_toks=20000"

However, budget forcing keeps producing the error.

I've also tried the inference code from the README:

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Decide on a token limit for thinking; As the model's max tokens is 32768, 32000 usually ensures there is enough space for the model to still answer
MAX_TOKENS_THINKING = 32000 
# Decide how often to ignore end-of-thinking token
NUM_IGNORE = 1

model = LLM(
    "/home/hslim/data/models/s1/ckpts/s1-20250310_141828", # s1 originally gets this prompt wrong but with budget forcing it fixes it
    tensor_parallel_size=2,
)
tok = AutoTokenizer.from_pretrained(
    "/home/hslim/data/models/s1/ckpts/s1-20250310_141828"
)

stop_token_ids = tok("<|im_end|>")["input_ids"]
sampling_params = SamplingParams(
    max_tokens=32768,
    min_tokens=0,
    stop_token_ids=stop_token_ids,
    skip_special_tokens=False,
    temperature=0.0,
)

# For the exact raspberry sample in the paper see
prompts = [
    "How many r in raspberry",
]

for i, p in enumerate(prompts):
    prompt = "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n<|im_start|>user\n" + p + "<|im_end|>\n<|im_start|>assistant\n"
    stop_token_ids = tok("<|im_start|><|im_end|>")["input_ids"]
    sampling_params = SamplingParams(
        max_tokens=MAX_TOKENS_THINKING,
        min_tokens=0,
        stop_token_ids=stop_token_ids,
        skip_special_tokens=False,
        temperature=0.0,
    )
    prompt += "<|im_start|>think"
    o = model.generate(
        prompt,
        sampling_params=sampling_params
    )
    
    print(o[0].outputs[0].text)
    
    ignore_str = "Wait"
    max_tokens_thinking_tmp = MAX_TOKENS_THINKING
    if max_tokens_thinking_tmp > 0:
        for i in range(NUM_IGNORE): # Num of times to skip stop token
            max_tokens_thinking_tmp -= len(o[0].outputs[0].token_ids)
            prompt += o[0].outputs[0].text + ignore_str
            sampling_params = SamplingParams(
                max_tokens=max_tokens_thinking_tmp,
                min_tokens=1,
                stop_token_ids=stop_token_ids,
                skip_special_tokens=False,
                temperature=0.0,
            )
            o = model.generate(
                prompt,
                sampling_params=sampling_params
            )
    ### Final answer ###
    prompt += o[0].outputs[0].text # You can also append "Final Answer:" here like we do for some evaluations to prevent the model from just continuing to reason in its answer when early exiting
    stop_token_ids = tok("<|im_end|>")["input_ids"]
    sampling_params = SamplingParams(
        max_tokens=32768,
        min_tokens=0,
        stop_token_ids=stop_token_ids,
        skip_special_tokens=False,
        temperature=0.0,
    )
    o = model.generate(
        prompt,
        sampling_params=sampling_params,
    )
    print("With budget forcing:") # You will see that after the "Wait" in the reasoning trace it fixes its answer
    print(prompt + o[0].outputs[0].text)

The first output prints without a problem, but the error appears during budget forcing:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 55
     53         max_tokens_thinking_tmp -= len(o[0].outputs[0].token_ids)
     54         prompt += o[0].outputs[0].text + ignore_str
---> 55         sampling_params = SamplingParams(
     56             max_tokens=max_tokens_thinking_tmp,
     57             min_tokens=1,
     58             stop_token_ids=stop_token_ids,
     59             skip_special_tokens=False,
     60             temperature=0.0,
     61         )
     62         o = model.generate(
     63             prompt,
     64             sampling_params=sampling_params
     65         )
     66 ### Final answer ###

File ~/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/sampling_params.py:337, in SamplingParams.__post_init__(self)
    334 if self.stop and not self.include_stop_str_in_output:
    335     self.output_text_buffer_length = max(len(s) for s in self.stop) - 1
--> 337 self._verify_args()
    339 if self.temperature < _SAMPLING_EPS:
    340     # Zero temperature means greedy sampling.
    341     self.top_p = 1.0

File ~/miniforge3/envs/s1/lib/python3.10/site-packages/vllm/sampling_params.py:378, in SamplingParams._verify_args(self)
    375     raise ValueError("min_p must be in [0, 1], got "
    376                      f"{self.min_p}.")
    377 if self.max_tokens is not None and self.max_tokens < 1:
--> 378     raise ValueError(
    379         f"max_tokens must be at least 1, got {self.max_tokens}.")
    380 if self.min_tokens < 0:
    381     raise ValueError(f"min_tokens must be greater than or equal to 0, "
    382                      f"got {self.min_tokens}.")

ValueError: max_tokens must be at least 1, got 0.

How can I solve it? Thank you in advance.

@Muennighoff
Contributor

I think this issue may be helpful: #35

@lixin2002cn

I ran into the same error. Did you manage to solve it?

@Muennighoff
Contributor

I think you probably need to decrease max_gen_toks

@lixin2002cn

I think you probably need to decrease max_gen_toks

I tried this, but it didn't work; the error just occurs earlier.

TikaToka (Author) commented Apr 6, 2025

I ran into the same error. Did you manage to solve it?

There seems to be a specific condition that triggers this, but I don't have time to investigate it for now.

So I gave up on fixing the problem directly and worked around it by re-training the model.

@RohollahHS

Same problem here.

RohollahHS commented Apr 13, 2025

I think the error stems from here

If the first part of the condition is True but the second part is not, then at the end of the for loop requests_tmp ends up as an empty list and vLLM is called with no prompts. So the generation should be ended with a break or something similar; a sketch of that idea is below.
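
To make that concrete, here is a guarded version of the standalone budget-forcing loop from the issue body (a minimal sketch of the break-when-exhausted idea, not a patch to the harness itself; the checkpoint path is a placeholder, the thinking_params helper is my own, and the separate final-answer phase from the README is omitted for brevity):

# Hedged sketch of a guarded budget-forcing loop, based on the standalone
# snippet in the issue body. It clamps the remaining thinking budget and
# breaks out once it is exhausted, which avoids constructing SamplingParams
# with max_tokens <= 0 and avoids an extra generate call with nothing to ask.
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

MODEL_PATH = "ckpts/s1-20250310_141828"  # placeholder: point at your checkpoint
MAX_TOKENS_THINKING = 32000
NUM_IGNORE = 1

model = LLM(MODEL_PATH, tensor_parallel_size=2)
tok = AutoTokenizer.from_pretrained(MODEL_PATH)
stop_token_ids = tok("<|im_start|><|im_end|>")["input_ids"]

def thinking_params(budget: int) -> SamplingParams:
    # Sampling parameters for one thinking round, capped at the remaining budget.
    return SamplingParams(
        max_tokens=budget,
        min_tokens=1,
        stop_token_ids=stop_token_ids,
        skip_special_tokens=False,
        temperature=0.0,
    )

prompt = (
    "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. "
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHow many r in raspberry<|im_end|>\n"
    "<|im_start|>assistant\n<|im_start|>think"
)

budget = MAX_TOKENS_THINKING
o = model.generate(prompt, sampling_params=thinking_params(budget))

for _ in range(NUM_IGNORE):
    budget -= len(o[0].outputs[0].token_ids)
    if budget < 1:
        # Thinking budget exhausted: stop forcing "Wait" instead of building
        # SamplingParams with max_tokens=0 or generating with an empty request.
        break
    prompt += o[0].outputs[0].text + "Wait"
    o = model.generate(prompt, sampling_params=thinking_params(budget))

print(prompt + o[0].outputs[0].text)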
