
Problem running InternVL-1.5 inference on a V100 #870

Closed
NLP-Learning opened this issue May 6, 2024 · 6 comments

@NLP-Learning

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input exit or quit to exit the conversation.
[INFO:swift] Input multi-line to switch to multi-line input mode.
[INFO:swift] Input reset-system to reset the system and clear the history.
[INFO:swift] Input clear to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image
Input a media path or URL <<< https://img2.baidu.com/it/u=2085854734,3872819026&fm=253&fmt=auto&app=138&f=JPEG?w=762&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

Hello, I followed your tutorial on a V100 but still run into this error in the end. Is there a way to work around it on a V100?

@hjh0119 hjh0119 self-assigned this May 7, 2024
@hjh0119 hjh0119 mentioned this issue May 7, 2024
@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Pull the latest code, then pass:

--use_flash_attn false

@NLP-Learning
Author

NLP-Learning commented May 7, 2024

Pull the latest code, then pass:

--use_flash_attn false

Thanks for the reply, but it still errors out. Could you advise further:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe this image in as much detail as possible
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

The command was:

CUDA_VISIBLE_DEVICES=6 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5-Int8/ --use_flash_attn false

@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Thanks for the reply, but it still errors out. Could you advise further:

The int8 version hasn't been made compatible yet; the original (non-quantized) model should work fine.

You can try editing the attn_implementation value in the local model's config.json, changing it to eager.
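A minimal sketch of that edit, assuming the config carries a top-level `attn_implementation` key (InternVL-style configs may also nest one under `llm_config`; both the key layout and the helper name are assumptions, not swift's actual code):

```python
def set_eager_attention(cfg: dict) -> dict:
    """Return a copy of a model config dict with attention forced to 'eager'."""
    patched = dict(cfg)
    patched["attn_implementation"] = "eager"
    # InternVL-style configs may nest the LLM's own config (assumption).
    if isinstance(patched.get("llm_config"), dict):
        patched["llm_config"] = {**patched["llm_config"], "attn_implementation": "eager"}
    return patched
```

To apply it, `json.load` the model directory's config.json, pass the dict through this helper, and `json.dump` it back in place.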

@NLP-Learning
Author

Thanks for the reply, but it still errors out. Could you advise further:

The int8 version hasn't been made compatible yet; the original (non-quantized) model should work fine.

You can try editing the attn_implementation value in the local model's config.json, changing it to eager.

I changed the attn_implementation value for the Int8 model as you suggested, but it still errors:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2829, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Then, instead of the quantized model, I loaded the original internvl-chat-v1_5 with this command:

CUDA_VISIBLE_DEVICES=1,2 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5/ --use_flash_attn false

It still errors:

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Explain the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internvl_chat.py", line 359, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

But after changing attn_implementation to eager in the original internvl-chat-v1_5's config.json, the command above works! Thanks for your great work!

(screenshot attached)

@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Thanks for the feedback; I'll fix it tomorrow.

@hjh0119
Collaborator

hjh0119 commented May 8, 2024

The int8 model is now supported.

For GPUs that do not support flash attention, you can now use --use_flash_attn false to train and infer normally.
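The fix presumably amounts to selecting the attention backend based on whether the flash_attn package is importable; a rough sketch of that idea (the function name is hypothetical, not swift's actual code):

```python
import importlib.util

def pick_attn_implementation(use_flash_attn: bool = True) -> str:
    """Fall back to 'eager' when flash_attn is disabled or not importable."""
    if use_flash_attn and importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"
```

On a V100 (pre-Ampere, where flash-attn cannot be installed) this would always resolve to "eager", which matches the manual config.json workaround above.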

@hjh0119 hjh0119 closed this as completed May 15, 2024