
Eval fails for SGLang-deployed Qwen3-VL-Thinking with separate reasoning enabled #847

@JustinTong0323

Description


NOTE: The same eval command works for the Instruct model.
Eval command:

python3 -m lmms_eval --model openai_compatible --model_args "model_version=Qwen/Qwen3-VL-30B-A3B-Thinking" --tasks mmmu_val --batch_size 128 --log_samples --log_samples_suffix "openai_compatible" --output_path ./logs --gen_kwargs "max_new_tokens=4096"
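To narrow this down, here is a minimal sketch for inspecting what the server actually returns for the Thinking model, bypassing lmms_eval. The base URL/port, API key, and the reasoning_content field are assumptions about a local SGLang setup, not taken from this report:

# Minimal reproduction sketch (assumptions: SGLang serving the model on
# localhost:30000 via its OpenAI-compatible API, separate reasoning enabled).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen/Qwen3-VL-30B-A3B-Thinking",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=256,
)
msg = resp.choices[0].message
# If the answer lands in a separate reasoning field while message.content
# stays None, that would match the NoneType error shown below.
print("content:", msg.content)
print("reasoning_content:", getattr(msg, "reasoning_content", None))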

Error message:

2025-10-04 09:38:49 | INFO     | lmms_eval.models.model_utils.gen_metrics:log_metrics:48 - Metric summary - Total time: 6321.931s, Total tokens: 1400994, Avg speed: 221.6 tokens/s
Model Responding: 100%|██████████| 900/900 [1:46:50<00:00,  7.12s/it]
Postprocessing:   0%|          | 0/900 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/.python/sglang/lib/python3.10/site-packages/tenacity/__init__.py", line 470, in __call__
    result = fn(*args, **kwargs)
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/api/task.py", line 1470, in process_results
    results = [res.strip() for res in results]
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/api/task.py", line 1470, in <listcomp>
    results = [res.strip() for res in results]
AttributeError: 'NoneType' object has no attribute 'strip'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/__main__.py", line 347, in cli_evaluate
    results, samples = cli_evaluate_single(args)
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/__main__.py", line 474, in cli_evaluate_single
    results = evaluator.simple_evaluate(
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/utils.py", line 533, in _wrapper
    return fn(*args, **kwargs)
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/evaluator.py", line 268, in simple_evaluate
    results = evaluate(
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/utils.py", line 533, in _wrapper
    return fn(*args, **kwargs)
  File "/root/.python/sglang/lib/python3.10/site-packages/lmms_eval/evaluator.py", line 555, in evaluate
    metrics = task.process_results(doc, [req.filtered_resps[filter_key] for req in requests])
  File "/root/.python/sglang/lib/python3.10/site-packages/tenacity/__init__.py", line 330, in wrapped_f
    return self(f, *args, **kw)
  File "/root/.python/sglang/lib/python3.10/site-packages/tenacity/__init__.py", line 467, in __call__
    do = self.iter(retry_state=retry_state)
  File "/root/.python/sglang/lib/python3.10/site-packages/tenacity/__init__.py", line 368, in iter
    result = action(retry_state)
  File "/root/.python/sglang/lib/python3.10/site-packages/tenacity/__init__.py", line 411, in exc_check
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7f19595cbc40 state=finished raised AttributeError>]
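The crash happens in process_results at lmms_eval/api/task.py:1470, where one of the filtered responses is None instead of a string. A local band-aid (a sketch only, since it hides the empty responses rather than fixing them) would be to coalesce None before stripping:

# Guarded version of the failing line from the traceback (sketch only)
results = [(res or "").strip() for res in results]

The real question is why the openai_compatible model returns None here: presumably, with separate reasoning enabled, the Thinking model's answer ends up in the reasoning field and message.content comes back empty or None.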
