crash when output token length is > 64 #633
Hi @Edward-Lin, thanks for reporting; I will try to reproduce it first. Could you also help collect environment information by running https://github.com/intel/intel-extension-for-pytorch/blob/main/scripts/collect_env.py and upload the result here? Thanks! By the way, are you referring to the guide here: https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/?
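
For reference, one way to capture the report to a file is a small wrapper like the sketch below. It assumes `collect_env.py` has already been downloaded into the current directory, and the output filename is just an illustration:

```python
import subprocess
import sys

# Run collect_env.py with the current Python interpreter and capture its
# stdout, so the resulting report can be attached to the issue.
with open("ipex-collect-env.txt", "w") as out:
    subprocess.run([sys.executable, "collect_env.py"], stdout=out, check=True)
```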
Attached: ipex-collect-env.txt
I was testing Llama 3. When I set max_new_tokens > 64 on a Windows Intel Core Ultra platform (Core Ultra 9 185, 32 GB), the program crashed.
The code and logs are attached.
Attachments: log_crash.txt, run_generation_gpu_woq_for_llama.py.txt
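
For anyone trying to reproduce this, a minimal sketch of the failing scenario is below. It is based only on the description above and does not include the weight-only quantization path from the attached script; the model ID, prompt, and dtype are assumptions, since the report only says "llama3":

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the "xpu" device)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed model; not confirmed by the report

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("xpu")
model.eval()

inputs = tokenizer("Once upon a time", return_tensors="pt").to("xpu")

# Per the report, generation succeeds with max_new_tokens <= 64 and
# crashes once the requested output length exceeds 64 tokens.
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```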