
How can I pass a video directly through the API for inference after deploying with vLLM? #742

Open

XyWzzZ opened this issue Feb 8, 2025 · 7 comments

@XyWzzZ

XyWzzZ commented Feb 8, 2025

I am serving Qwen2-VL-72B-Instruct with vLLM and trying to run video inference through the API. My current API call looks like this:

from openai import OpenAI

openai_api_key = "None"
openai_api_base = "http://xxxxxx/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Qwen2-VL-72B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "video_url", "video_url": {"url": "https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4"}},
            {"type": "text", "text": "Please describe what happens in this video."},
        ]},
    ],
)
Questions
1. The API currently accepts a video_url. Does it also support uploading a video file directly, without relying on an external URL?
2. If not, is there an officially recommended way to pass a local video file for inference, rather than splitting it into frames manually?
3. In a vLLM deployment, are any specific configuration options or parameters needed for the API to recognize and process video input?
Thanks in advance! 🙏

@948024326

Same question +1

@gymbeijing

gymbeijing commented Feb 10, 2025

Hi, may I ask which Python, PyTorch, and CUDA versions you are using? I am also deploying vLLM but am hitting a version mismatch:

python -m xformers.info
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.5.1 with CUDA 1201 (you have 2.5.1+cu121)
Python 3.10.15 (you have 3.10.16)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details

@wulipc

wulipc commented Feb 12, 2025

Yes, you can pass a local video file by using file:// plus its absolute path, for example:

video_url_for_local = "file:///your/local/path/to/v_3l7quTy4c2s.mp4"   #  file:// + your local absolute path
video_url_for_remote = "https://url/path/to/v_3l7quTy4c2s.mp4"

video_url = video_url_for_remote # or video_url_for_local
video_message = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
            {"type": "text", "text": "Could you go into detail about the content of this video?"},
            {"type": "video", "video": video_url, "total_pixels": 20480 * 28 * 28, "min_pixels": 16 * 28 * 28},
        ]
    }
]

For more video examples, see the Video inference section at https://github.com/QwenLM/Qwen2.5-VL?tab=readme-ov-file#using---transformers-to-chat
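As an aside, the total_pixels and min_pixels values above are written as multiples of 28 * 28 because Qwen2-VL's vision encoder consumes 28x28-pixel patches; a quick sketch of what those budgets work out to (the patch size is the only assumption here, the rest is arithmetic):

```python
PATCH = 28  # Qwen2-VL vision encoder patch edge, in pixels

# The budgets used in the message above: (number of patches) * (pixels per patch)
total_pixels = 20480 * PATCH * PATCH  # cap on total pixels across sampled frames
min_pixels = 16 * PATCH * PATCH       # floor on pixels per frame

print(total_pixels)  # 16056320
print(min_pixels)    # 12544
```

Raising total_pixels lets the sampler keep more or larger frames, at the cost of more visual tokens per request.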

@XyWzzZ
Author

XyWzzZ commented Feb 13, 2025

[quotes @wulipc's reply above]

Hello, thank you for your answer, but I still haven't succeeded after trying. I am using vLLM with the OpenAI API, and the error messages are as follows:

Traceback (most recent call last):
File "/home/xueyanwen/Qwen/test_qwen_video.py", line 24, in <module>
chat_response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 859, in create
return self._post(
^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 960, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1049, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1098, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1049, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1098, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1064, in _request
raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Internal Server Error
(base) [xueyanwen@A2-H3C-R5300-0201-0304-xx Qwen]$ python test_qwen_video.py
Traceback (most recent call last):
File "/home/xueyanwen/Qwen/test_qwen_video.py", line 25, in <module>
chat_response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/resources/chat/completions.py", line 859, in create
return self._post(
^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 960, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/xueyanwen/miniconda3/lib/python3.12/site-packages/openai/_base_client.py", line 1064, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "6 validation errors for ValidatorIterator\n0.typed-dict.text\n Field required [type=missing, input_value={'type': 'video', 'video'...n/video/狼来了.mp4'}}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.10/v/missing\n0.typed-dict.type\n Input should be 'text' [type=literal_error, input_value='video', input_type=str]\n For further information visit https://errors.pydantic.dev/2.10/v/literal_error\n0.typed-dict.image_url\n Field required [type=missing, input_value={'type': 'video', 'video'...n/video/狼来了.mp4'}}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.10/v/missing\n0.typed-dict.type\n Input should be 'image_url' [type=literal_error, input_value='video', input_type=str]\n For further information visit https://errors.pydantic.dev/2.10/v/literal_error\n0.typed-dict.input_audio\n Field required [type=missing, input_value={'type': 'video', 'video'...n/video/狼来了.mp4'}}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.10/v/missing\n0.typed-dict.type\n Input should be 'input_audio' [type=literal_error, input_value='video', input_type=str]\n For further information visit https://errors.pydantic.dev/2.10/v/literal_error", 'type': 'BadRequestError', 'param': None, 'code': 400}
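Reading the 400 above: the server validated each content part against the OpenAI-style types text, image_url, and input_audio, so a part with "type": "video" was rejected before inference ever ran. A small client-side pre-flight check in that spirit (the allowed set is taken from the error text, plus video_url as an assumption for vLLM builds with video support):

```python
# Content-part types the server's schema accepts, per the 400 error above;
# "video_url" is an assumption for vLLM versions that support video input.
ALLOWED_PART_TYPES = {"text", "image_url", "input_audio", "video_url"}

def check_content_parts(messages):
    """Fail fast locally instead of round-tripping to a server-side 400."""
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content is always fine
        for part in content:
            if part.get("type") not in ALLOWED_PART_TYPES:
                raise ValueError(f"unsupported content part type: {part.get('type')!r}")

# The shape that triggered the 400 above: type "video" instead of "video_url".
bad = [{"role": "user", "content": [{"type": "video", "video": "file:///path/to/clip.mp4"}]}]
try:
    check_content_parts(bad)
except ValueError as err:
    print(err)  # unsupported content part type: 'video'
```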

@XyWzzZ
Author

XyWzzZ commented Feb 13, 2025

[quotes @gymbeijing's question above]

Python 3.10.16, torch 2.5.1+cu124, CUDA 12.4

@wulipc

wulipc commented Feb 13, 2025

[quotes @gymbeijing's question above]

It looks like flash attention is not installed, which is why the backend fell back to xformers. I recommend using flash attention, which is more efficient. You can check for it with the following code:

from transformers.utils import is_flash_attn_2_available
print(is_flash_attn_2_available())

@wulipc

wulipc commented Feb 13, 2025

[quotes @XyWzzZ's error report above]

It looks like the message is written incorrectly. I suggest first testing with the message format from the README. Best wishes.
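Following up on that advice: given the schema in the 400 error, one plausible fix is to use the video_url content-part type (mirroring image_url) rather than video. Whether the server accepts video_url depends on your vLLM version, so treat this as a sketch to verify rather than a guaranteed recipe; the file path is a placeholder:

```python
# Build an OpenAI-style message list using "video_url" instead of "video".
# Once this shape passes the server's schema check, pass `messages` to
# client.chat.completions.create(model=..., messages=messages) against
# your vLLM endpoint.
video_url = "file:///your/local/path/to/clip.mp4"  # placeholder local file

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "video_url", "video_url": {"url": video_url}},
        {"type": "text", "text": "Please describe what happens in this video."},
    ]},
]

print(messages[1]["content"][0]["type"])  # video_url
```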
