
Failed to Run Benchmark for llama-3-8b-instruct and llama-3.1-8b-instruct Models. #822

Open
tim102187S opened this issue Sep 4, 2024 · 2 comments

tim102187S commented Sep 4, 2024

I attempted to run benchmarks for the llama-3-8b-instruct and llama-3.1-8b-instruct models on both CPU and GPU, but the process failed. (I successfully tested the llama2-7b-chatbot model.)

I followed the instructions in openvino_notebooks/llm-chatbot.ipynb to download the models and ensured that all necessary files (including the required tokenizer.model) were included. I am using the latest version of OpenVINO (2024.3.0) and have also upgraded the transformers library.

The command I executed is:
python benchmark.py -m {path}/openvino_notebooks/notebooks/llm-chatbot/llama-3-8b-instruct/INT4_compressed_weights -n 2 -d CPU -p "What is large language model (LLM)?"

And I received the following error output:
[Screenshot of the error output: Screenshot from 2024-09-04 16-56-13]

peterchen-intel (Collaborator) commented

Adding the -ic 512 option should work around this issue.
We have a PR to fix it, but it introduces a performance regression; analysis is still in progress.
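
For reference, a sketch of the workaround applied to the command from the issue description (the model path and prompt are copied from above; -ic 512 is assumed here to cap the number of generated tokens, and the value may need adjusting for your setup):

python benchmark.py -m {path}/openvino_notebooks/notebooks/llm-chatbot/llama-3-8b-instruct/INT4_compressed_weights -n 2 -d CPU -ic 512 -p "What is large language model (LLM)?"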

tim102187S (Author) commented

Thank you for your response. Adding the -ic 512 option indeed resolves the issue.

When can we expect the full solution to be available?

andrei-kochin added the category: llm_bench label on Oct 16, 2024