[BUG] genie-t2t-run Fails to run llama v2 7B quantized on Galaxy S23 Ultra #101
Comments
Hi, although my configuration and error were different from yours, I've succeeded in running llama2-7b with the gen2-windows config.
I had a similar error. I have updated the QNN version, and the device I'm using is an Honor phone (8gen3).
When running quantized llama v2 7B on the QNN HTP backend of a Snapdragon Gen 2 device, I get the following error.
What does it mean?
"Could not create context from binary for context index = 0 : err 1009"
The bin files (llama2_0.serialized.bin, llama2_1.serialized.bin, llama2_2.serialized.bin, llama2_3.serialized.bin) were generated with --target-gen snapdragon-gen2, as shown below:
python gen_ondevice_llama.py --hub-model-id m1q8lpygn,mrmdjx4km,mkngj646n,mknjj0gxn,meq2dy80m,mzmx5gykn,m6qejgw7m,mwn0p5d8m --output-dir ./export --tokenizer-zip-path ./tokenizer.zip --target-gen snapdragon-gen2 --target-os android
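Before digging into the error itself, it may be worth confirming that all four bin files named above actually landed in the output directory and are not empty. The following is a minimal sketch, not part of the export tooling: the helper name `missing_or_empty_bins` is hypothetical, and it assumes the files sit directly under the `--output-dir` path (`./export` in the command above). It does not explain err 1009; it only rules out an incomplete export or copy.

```python
from pathlib import Path

# File names come from the report above; that they sit directly
# under the export directory is an assumption.
EXPECTED_BINS = [f"llama2_{i}.serialized.bin" for i in range(4)]


def missing_or_empty_bins(export_dir):
    """Return the expected bin files that are absent or zero bytes long."""
    root = Path(export_dir)
    problems = []
    for name in EXPECTED_BINS:
        path = root / name
        # Flag the file if it does not exist or has no content.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # "./export" matches the --output-dir used in the command above.
    print(missing_or_empty_bins("./export"))
```

An empty list means all four files are present and non-empty, so the problem lies elsewhere.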