How can I pass a video directly for inference via the API after deploying with vLLM? #742
Comments
Same question here, +1.
Hi, could you share your Python, PyTorch, and CUDA versions? I am also deploying vLLM but ran into version mismatches.
Yes, you can pass a local video file like this:

```python
video_url_for_local = "file:///your/local/path/to/v_3l7quTy4c2s.mp4"  # "file://" + your local absolute path
video_url_for_remote = "https://url/path/to/v_3l7quTy4c2s.mp4"
video_url = video_url_for_remote  # or video_url_for_local

video_message = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "text", "text": "Could you go into detail about the content of this video?"},
        {"type": "video", "video": video_url, "total_pixels": 20480 * 28 * 28, "min_pixels": 16 * 28 * 28},
    ]},
]
```

For more video examples, please see:
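On the local side, this message format is what the Qwen2-VL examples feed into qwen-vl-utils; a minimal sketch of decoding the video with it (assuming the qwen-vl-utils package is installed) could look like:

```python
# Sketch: decode the video referenced in video_message using qwen-vl-utils
# (assumes `pip install qwen-vl-utils`; video_message is defined as above).
from qwen_vl_utils import process_vision_info

image_inputs, video_inputs = process_vision_info(video_message)
print(video_inputs[0].shape)  # tensor of sampled video frames
```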
Hello, thank you for your answer, but I still haven't succeeded after trying. I am using vLLM with the OpenAI API, and the error message starts as follows:

Traceback (most recent call last):
Python 3.10.16, torch 2.5.1+cu124, CUDA 12.4
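For anyone who wants to check the same versions on their own setup, a quick snippet:

```python
import platform
import torch

print(platform.python_version())  # e.g. 3.10.16
print(torch.__version__)          # e.g. 2.5.1+cu124
print(torch.version.cuda)         # e.g. 12.4
```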
It looks like flash attention is not installed, so the backend fell back to xformers. Flash attention is recommended for better efficiency; you can run a quick self-test with the code below:
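A minimal sketch of such a self-test, assuming flash-attn 2.x on a CUDA GPU:

```python
# Self-test sketch: verify flash-attn is importable and runs on this GPU
# (assumes flash-attn 2.x installed via `pip install flash-attn`).
import torch

try:
    from flash_attn import flash_attn_func

    q = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
    k = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
    v = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
    out = flash_attn_func(q, k, v)
    print("flash-attn works, output shape:", out.shape)
except ImportError:
    print("flash-attn is not installed; vLLM will fall back to xformers.")
```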
It looks like the message is malformed; I suggest first testing with the message format from the README. Best wishes.
I am serving Qwen2-VL-72B-Instruct with vLLM and trying to run video inference through the API. My current API call looks like this:
```python
from openai import OpenAI

openai_api_key = "None"
openai_api_base = "http://xxxxxx/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Qwen2-VL-72B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "video_url", "video_url": {"url": "https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4"}},
            {"type": "text", "text": "Please describe what happens in this video in detail."},
        ]},
    ],
)
```
Questions:
1. The API currently accepts a video_url, but does it also support uploading a video file directly, without relying on an external URL?
2. If not, is there an officially recommended way to pass a local video file for inference directly, rather than splitting it into frames manually?
3. In a vLLM deployment, are any specific configurations or parameters needed for the API to recognize and process video input?
Thanks for your help! 🙏
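Regarding question 1: assuming your vLLM build accepts base64 data URLs for the video_url content type (an assumption, not confirmed in this thread), you can inline a local file instead of hosting it externally:

```python
import base64

from openai import OpenAI

client = OpenAI(api_key="None", base_url="http://xxxxxx/v1")

# Read the local video and inline it as a base64 data URL. Assumption: the
# deployed vLLM version accepts data URLs for the "video_url" content type.
with open("/your/local/path/to/baby.mp4", "rb") as f:
    video_b64 = base64.b64encode(f.read()).decode("utf-8")

chat_response = client.chat.completions.create(
    model="Qwen2-VL-72B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "video_url", "video_url": {"url": f"data:video/mp4;base64,{video_b64}"}},
            {"type": "text", "text": "Please describe what happens in this video."},
        ]},
    ],
)
print(chat_response.choices[0].message.content)
```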