
Problem running InternVL-1.5 inference on a V100 #870

Closed
NLP-Learning opened this issue May 6, 2024 · 6 comments

@NLP-Learning

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input exit or quit to exit the conversation.
[INFO:swift] Input multi-line to switch to multi-line input mode.
[INFO:swift] Input reset-system to reset the system and clear the history.
[INFO:swift] Input clear to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image
Input a media path or URL <<< https://img2.baidu.com/it/u=2085854734,3872819026&fm=253&fmt=auto&app=138&f=JPEG?w=762&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

Hello, I followed your tutorial on a V100 but still run into this error in the end. Is there a way to work around it on a V100?

@hjh0119 hjh0119 self-assigned this May 7, 2024
@hjh0119 hjh0119 mentioned this issue May 7, 2024
@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Pull the latest code, then pass:

--use_flash_attn false

@NLP-Learning
Author

NLP-Learning commented May 7, 2024

Pull the latest code, then pass:

--use_flash_attn false

Thanks for the reply, but it still errors out. Could you advise further:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe this image in as much detail as possible
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

The command was:

CUDA_VISIBLE_DEVICES=6 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5-Int8/ --use_flash_attn false

@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Thanks for the reply, but it still errors out. Could you advise further:

The int8 version hasn't been made compatible yet; the original (non-quantized) model should work fine.

You can try editing the attn_implementation value in the local model's config.json, changing it to eager.
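A minimal sketch of that edit, assuming the config carries a top-level `attn_implementation` key (InternVL-style configs may also nest one under `llm_config`; both the key layout and the helper name are assumptions, not swift's actual code):

```python
def set_eager_attention(cfg: dict) -> dict:
    """Return a copy of a model config dict with attention forced to 'eager'."""
    patched = dict(cfg)
    patched["attn_implementation"] = "eager"
    # InternVL-style configs may nest the LLM's own config (assumption).
    if isinstance(patched.get("llm_config"), dict):
        patched["llm_config"] = {**patched["llm_config"], "attn_implementation": "eager"}
    return patched
```

To apply it, `json.load` the model directory's config.json, pass the dict through this helper, and `json.dump` it back in place.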

@NLP-Learning
Author

Thanks for the reply, but it still errors out. Could you advise further:

The int8 version hasn't been made compatible yet; the original (non-quantized) model should work fine.

You can try editing the attn_implementation value in the local model's config.json, changing it to eager.

I changed the attn_implementation value for the Int8 model as you suggested, but it still errors:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2829, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Then, instead of the quantized model, I loaded the original internvl-chat-v1_5 with this command:

CUDA_VISIBLE_DEVICES=1,2 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5/ --use_flash_attn false

It still errors:

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Explain the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internvl_chat.py", line 359, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

But after changing attn_implementation to eager in the original internvl-chat-v1_5's config.json, the command above works! Thanks for your great work!

(screenshot attached)

@hjh0119
Collaborator

hjh0119 commented May 7, 2024

Thanks for the feedback; I'll fix it tomorrow.

@hjh0119
Collaborator

hjh0119 commented May 8, 2024

The int8 model is now supported.

For GPUs that do not support flash attention, you can now use --use_flash_attn false to train and infer normally.
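The fix presumably amounts to selecting the attention backend based on whether the flash_attn package is importable; a rough sketch of that idea (the function name is hypothetical, not swift's actual code):

```python
import importlib.util

def pick_attn_implementation(use_flash_attn: bool = True) -> str:
    """Fall back to 'eager' when flash_attn is disabled or not importable."""
    if use_flash_attn and importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "eager"
```

On a V100 (pre-Ampere, where flash-attn cannot be installed) this would always resolve to "eager", which matches the manual config.json workaround above.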

@hjh0119 hjh0119 closed this as completed May 15, 2024