Qwen2-VL-7B cannot run with TP=2 on 2*ARC #12934

Open
Zjq9409 opened this issue Mar 5, 2025 · 1 comment

Zjq9409 commented Mar 5, 2025

Hardware: 4*Arc workstation
Docker start command:
#!/bin/bash
export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-xpu:latest
export CONTAINER_NAME=ipex-llm-serving-xpu-container_1
sudo docker run -it \
        --privileged \
        --net=host \
        --device=/dev/dri \
        -v /home/test/:/llm/models \
        -e no_proxy=localhost,127.0.0.1 \
        --memory="32G" \
        --name=$CONTAINER_NAME \
        --shm-size="16g" \
        --entrypoint /bin/bash \
        $DOCKER_IMAGE

MODEL_PATH="/llm/models/LLM/Qwen2-VL-7B-Instruct/"
SERVED_MODEL_NAME="Qwen2-VL-7B-Instruct/"
TENSOR_PARALLEL_SIZE=2 # Default to 1 if not set

`(WrapperWithLoadBit pid=786) -----> current rank: 1, world size: 2, byte_count: 21504000
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] Error executing method determine_num_available_blocks. This might cause deadlock in distributed execution.
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] Traceback (most recent call last):
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/worker_base.py", line 461, in execute_method
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return executor(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return func(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_worker.py", line 106, in determine_num_available_blocks
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] self.model_runner.profile_run()
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return func(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_model_runner.py", line 821, in profile_run
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] self.execute_model(model_input, kv_caches, intermediate_tensors)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return func(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_model_runner.py", line 931, in execute_model
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] hidden_or_intermediate_states = model_executable(
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return self._call_impl(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return forward_call(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1327, in forward
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] hidden_states = self.language_model.model(
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/compilation/decorators.py", line 168, in __call__
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return self.forward(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 340, in forward
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] hidden_states, residual = layer(
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return self._call_impl(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] return forward_call(*args, **kwargs)
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 247, in forward
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] hidden_states = self.self_attn(
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] ^^^^^^^^^^^^^^^
(WrapperWithLoadBit pid=786) ERROR 03-05 14:13:23 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 03-05 14:13:42 worker_base.py:469] Error executing method determine_num_available_blocks. This might cause deadlock in distributed execution.
ERROR 03-05 14:13:42 worker_base.py:469] Traceback (most recent call last):
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/worker_base.py", line 461, in execute_method
ERROR 03-05 14:13:42 worker_base.py:469] return executor(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-05 14:13:42 worker_base.py:469] return func(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_worker.py", line 106, in determine_num_available_blocks
ERROR 03-05 14:13:42 worker_base.py:469] self.model_runner.profile_run()
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-05 14:13:42 worker_base.py:469] return func(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_model_runner.py", line 821, in profile_run
ERROR 03-05 14:13:42 worker_base.py:469] self.execute_model(model_input, kv_caches, intermediate_tensors)
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 03-05 14:13:42 worker_base.py:469] return func(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/worker/xpu_model_runner.py", line 931, in execute_model
ERROR 03-05 14:13:42 worker_base.py:469] hidden_or_intermediate_states = model_executable(
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return self._call_impl(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return forward_call(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2_vl.py", line 1327, in forward
ERROR 03-05 14:13:42 worker_base.py:469] hidden_states = self.language_model.model(
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/compilation/decorators.py", line 168, in __call__
ERROR 03-05 14:13:42 worker_base.py:469] return self.forward(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 340, in forward
ERROR 03-05 14:13:42 worker_base.py:469] hidden_states, residual = layer(
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return self._call_impl(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return forward_call(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 247, in forward
ERROR 03-05 14:13:42 worker_base.py:469] hidden_states = self.self_attn(
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return self._call_impl(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return forward_call(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 176, in forward
ERROR 03-05 14:13:42 worker_base.py:469] attn_output = self.attn(q,
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return self._call_impl(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
ERROR 03-05 14:13:42 worker_base.py:469] return forward_call(*args, **kwargs)
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/attention/layer.py", line 134, in forward
ERROR 03-05 14:13:42 worker_base.py:469] return self.impl.forward(query,
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] File "/usr/local/lib/python3.11/dist-packages/vllm/attention/backends/ipex_attn.py", line 449, in forward
ERROR 03-05 14:13:42 worker_base.py:469] sub_out = xe_addons.sdp_causal(
ERROR 03-05 14:13:42 worker_base.py:469] ^^^^^^^^^^^^^^^^^^^^^
ERROR 03-05 14:13:42 worker_base.py:469] RuntimeError: UR backend failed. UR backend returns:40 (UR_RESULT_ERROR_OUT_OF_RESOURCES)

`
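
The traceback above is easier to scan once the per-line logger prefix is stripped. The following is a minimal sketch, not part of the original report; it assumes the server output was saved to a file (the name `server.log` in the usage note is hypothetical) and that the prefixes look like the ones pasted here.

```shell
# Strip the "(WrapperWithLoadBit pid=N) " worker tag and the
# "ERROR MM-DD HH:MM:SS <file>.py:<line>] " logger prefix from each line.
strip_prefix() {
  sed -E 's/^\(WrapperWithLoadBit pid=[0-9]+\) //; s/^ERROR [0-9]{2}-[0-9]{2} [0-9:]{8} [A-Za-z_]+\.py:[0-9]+\] ?//'
}

# Print only the last exception line (for this log, the RuntimeError).
last_error() {
  strip_prefix | grep -E '^[A-Za-z_.]*(Error|Exception): ' | tail -n 1
}
```

Running `last_error < server.log` on the log above would be expected to print the final `RuntimeError: UR backend failed...` line, which is the part that matters for triage.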

@gc-fu gc-fu assigned gc-fu and unassigned gc-fu Mar 5, 2025
Contributor

gc-fu commented Mar 5, 2025

Hi, can you try commenting out `source /opt/intel/1ccl-wks/setvars.sh` and running again?
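
One way to try this without hand-editing is to comment the line out with `sed`. This is a minimal sketch under assumptions: the function name is illustrative, and the script path you pass in is whichever start script you run inside the container (the file name in the example is a placeholder, not taken from this issue).

```shell
# Comment out any uncommented line that sources the oneCCL setvars.sh.
# A backup copy of the original is written next to the script as <script>.bak.
comment_out_setvars() {
  script="$1"
  sed -i.bak -E 's|^([[:space:]]*)(source[[:space:]]+/opt/intel/1ccl-wks/setvars\.sh.*)$|\1# \2|' "$script"
}

# Example (hypothetical file name):
# comment_out_setvars ./my-start-script.sh
```

Lines that are already commented out are left untouched, and restoring the `.bak` file undoes the change.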
