Qwen2.5-VL failing #192

Closed · tmoroney opened this issue Jan 29, 2025 · 17 comments

@tmoroney

I am getting this error when using the new Qwen2.5-VL models.

Command:

python -m mlx_vlm.generate --model mlx-community/Qwen2.5-VL-3B-Instruct-3bit --max-tokens 100 --temp 0.0 --prompt "Describe this image." --image "frames/frame-01.jpg"
Fetching 11 files: 100%|████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 55056.50it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/mlx_vlm/generate.py", line 156, in <module>
    main()
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/mlx_vlm/generate.py", line 101, in main
    prompt = apply_chat_template(processor, config, prompt, num_images=len(args.image))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/mlx_vlm/prompt_utils.py", line 159, in apply_chat_template
    return processor.apply_chat_template(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 1683, in apply_chat_template
    rendered_chat = compiled_template.render(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/jinja2/environment.py", line 1295, in render
    self.environment.handle_exception()
  File "/Users/moroneyt/Documents/VLM-Testing/.venv/lib/python3.12/site-packages/jinja2/environment.py", line 942, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 23, in top-level template code
TypeError: can only concatenate str (not "list") to str
@Blaizzy (Owner) commented Jan 30, 2025

This was fixed. The chat template was missing, but it's back now.

@Blaizzy (Owner) commented Jan 30, 2025

Just download it again
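
If deleting the cache by hand is a pain, here is a minimal sketch of forcing a re-download through the huggingface_hub API (assuming the model was fetched through the Hub cache):

# Sketch: force a fresh snapshot download so the updated chat template
# replaces any stale cached copy.
from huggingface_hub import snapshot_download

snapshot_download(
    "mlx-community/Qwen2.5-VL-3B-Instruct-3bit",
    force_download=True,  # ignore cached files and re-fetch everything
)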

@tmoroney (Author) commented Jan 30, 2025

I have tried deleting the model and downloading it again, but unfortunately I am getting the same issue.

https://huggingface.co/mlx-community/Qwen2.5-VL-3B-Instruct-3bit

@tmoroney (Author)

I have checked and the chat template is included with the downloaded model files, so it seems that something else is causing this error.

@Blaizzy (Owner) commented Jan 30, 2025

Could you share a reproducible script? Also, please share the PyPI versions.

@abaranovskis-redsamurai

Same error here.

Running:

 python -m mlx_vlm.generate --model mlx-community/Qwen2.5-VL-7B-Instruct-bf16 --max-tokens 100 --temp 0.0 --prompt "Describe this image." --image http://images.cocodataset.org/val2017/000000039769.jpg
transformers==4.48.1
torchvision==0.21.0
torch==2.6.0
mlx==0.22.0
mlx-vlm==0.1.12

Error:

Traceback (most recent call last):
  File "/Users/andrejb/.pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/andrejb/.pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/mlx_vlm/generate.py", line 156, in <module>
    main()
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/mlx_vlm/generate.py", line 101, in main
    prompt = apply_chat_template(processor, config, prompt, num_images=len(args.image))
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/mlx_vlm/prompt_utils.py", line 159, in apply_chat_template
    return processor.apply_chat_template(
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1687, in apply_chat_template
    rendered_chat = compiled_template.render(
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/jinja2/environment.py", line 1295, in render
    self.environment.handle_exception()
  File "/Users/andrejb/Work/katana-git/sparrow/sparrow-data/parse/.env_sparrow_parse/lib/python3.10/site-packages/jinja2/environment.py", line 942, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 23, in top-level template code
TypeError: can only concatenate str (not "list") to str

@evdc commented Jan 30, 2025

This appears to be because the Qwen2.5-VL models ship with a chat_template.json file (cf. here), which holds the jinja2 template as a string value inside a JSON object.

The transformers library expects, by default, to find the chat template as raw jinja2 (not inside a JSON object) in a file called chat_template.jinja (cf. here) under the model path. Not finding one, it falls back to a generic template, which doesn't match the format in which mlx_vlm.generate passes its messages.

This can be temporarily worked around by copying the chat template from within the JSON file I linked above and saving it as chat_template.jinja in the same directory as the model (e.g. /Users/<your user>/.cache/huggingface/hub/models--mlx-community--Qwen2.5-VL-3B-Instruct-3bit/snapshots/<some snapshot id>/ -- poke around with ls to find the right directory).
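
For reference, a minimal sketch of that file-copy step (the snapshot path is an example placeholder; adjust it to your own cache):

# Sketch: pull the jinja template out of chat_template.json and save it
# as chat_template.jinja next to the model files.
import json
from pathlib import Path

snapshot = (Path.home() / ".cache/huggingface/hub"
            / "models--mlx-community--Qwen2.5-VL-3B-Instruct-3bit"
            / "snapshots" / "<some snapshot id>")

template = json.loads((snapshot / "chat_template.json").read_text())["chat_template"]
(snapshot / "chat_template.jinja").write_text(template)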

I'll post a minimal reproducible test in a minute

@evdc commented Jan 30, 2025

Alternatively, this seems to work without having to modify downloaded files:

Change the function mlx_vlm.generate.get_model_and_processors to the following:

def get_model_and_processors(model_path, adapter_path):
    # Assumes json, get_model_path, load_config and load are in scope in mlx_vlm.generate
    path = get_model_path(model_path)
    # Read the jinja template out of chat_template.json ourselves...
    with open(path / "chat_template.json") as f:
        templ = json.load(f)["chat_template"]

    config = load_config(model_path, trust_remote_code=True)
    # ...and pass it explicitly so the processor doesn't fall back to a generic template
    model, processor = load(
        model_path, adapter_path=adapter_path, lazy=False, trust_remote_code=True, chat_template=templ
    )
    return model, processor, config

We explicitly load and specify the chat template when loading the processor, so it picks up the correct one.

Just verified working (latest main branch of this repo, transformers==4.48.1)

@Blaizzy (Owner) commented Jan 30, 2025

That's interesting, because if you install transformers from source it works without any changes:

v4.49.0.dev0

pip install git+https://github.com/huggingface/transformers

@Blaizzy (Owner) commented Jan 30, 2025

v4.48.1 indeed causes the issue.

But if you install transformers from source it should work 👌🏽

@abaranovskis-redsamurai

> v4.48.1 indeed causes the issue.
>
> But if you install transformers from source it should work 👌🏽

Yes, indeed it works with transformers from source. Thanks.

@simonw commented Jan 30, 2025

Thanks to that tip this worked for me:

uv run --with 'numpy<2' \
  --with 'git+https://github.com/huggingface/transformers' \
  --with mlx-vlm \
  python -m mlx_vlm.generate \
    --model mlx-community/Qwen2.5-VL-7B-Instruct-8bit \
    --max-tokens 100 \
    --temp 0.0 \
    --prompt "Describe this image." \
    --image path-to-image.png

Result on my blog: https://simonwillison.net/2025/Jan/27/qwen25-vl-qwen25-vl-qwen25-vl/#qwen-vl-mlx-vlm

@Blaizzy (Owner) commented Jan 30, 2025

Most welcome!

Great article 🔥🙌🏽

I would love it if we had a cookbook like that for mlx-vlm.

We already have a few recipes (here) but we definitely need more.

@jrp2014 commented Jan 31, 2025

This model is one of the few that blows up if you give it a large pic:

Terminating due to uncaught exception: [metal::malloc] Attempting to allocate 134767706112 bytes which is greater than the maximum allowed buffer size of 77309411328 bytes.

zsh: abort      python -m mlx_vlm.generate --model mlx-community/Qwen2.5-VL-7B-Instruct-bf16 
/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
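
A minimal workaround sketch, assuming Pillow is available: downscale the image before passing it in (the 1024 px cap is an arbitrary example, not a model limit).

# Sketch: shrink an oversized image so the vision encoder doesn't request
# an enormous Metal buffer. The 1024 px cap is an arbitrary example value.
from PIL import Image

def shrink(src, dst, max_side=1024):
    img = Image.open(src)
    scale = max_side / max(img.size)
    if scale < 1:  # only ever shrink, never upscale
        img = img.resize((round(img.width * scale), round(img.height * scale)))
    img.save(dst)

shrink("large-photo.jpg", "large-photo-small.jpg")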

@dcup commented Feb 6, 2025

I did some digging into the error messages: the preprocessor class for Qwen2.5-VL is "Qwen2_5_VLProcessor", and this class is not included in the current transformers release, v4.48.2.

The class is on the dev branch (v4.49.0.dev0), so installing from GitHub works.
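
A quick way to check whether the installed build ships the class (a minimal sketch):

# Sketch: check whether the installed transformers provides Qwen2_5_VLProcessor
# (present from v4.49.0.dev0 onwards).
import transformers

print(transformers.__version__)
try:
    from transformers import Qwen2_5_VLProcessor
    print("Qwen2_5_VLProcessor available")
except ImportError:
    print("Qwen2_5_VLProcessor missing; install transformers from source")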

@RArchered

Same error; installing transformers from GitHub works:

pip install git+https://github.com/huggingface/transformers

@Blaizzy (Owner) commented Mar 3, 2025

Issue has been resolved.

Closing it for now.

Blaizzy closed this as completed on Mar 3, 2025