Describe the bug
When converting models for long contexts (--context-length 65536), the online compile job currently fails to produce some of the "token" models during export, e.g.: python -m qai_hub_models.models.llama_v3_2_3b_chat_quantized.export
To Reproduce
python -m qai_hub_models.models.llama_v3_2_3b_chat_quantized.export --chipset qualcomm-snapdragon-8-elite --skip-inferencing --skip-profiling --context-length 65536 --output-dir genie_bundle_llama_v3_2_3b_chat_65536/
Stack trace
Uploading llama_v3_2_3b_chat_quantized_token_3_of_3.aimet.zip part 1 of 4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00G/1.00G [00:57<00:00, 18.8MB/s]
Uploading llama_v3_2_3b_chat_quantized_token_3_of_3.aimet.zip part 2 of 4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00G/1.00G [00:53<00:00, 20.2MB/s]
Uploading llama_v3_2_3b_chat_quantized_token_3_of_3.aimet.zip part 3 of 4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00G/1.00G [01:14<00:00, 14.4MB/s]
Uploading llama_v3_2_3b_chat_quantized_token_3_of_3.aimet.zip part 4 of 4
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512M/512M [00:27<00:00, 19.7MB/s]
Scheduled compile job (jgo2dvqqp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgo2dvqqp/
Waiting for compile job (jgn61e2k5) completion. Type Ctrl+C to stop waiting at any time.
❌ FAILED Internal compiler error
Waiting for compile job (jp3jom33g) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/user/llm_on_genie_venv/lib/python3.10/site-packages/qai_hub_models/models/llama_v3_2_3b_chat_quantized/export.py", line 57, in <module>
main()
File "/home/user/llm_on_genie_venv/lib/python3.10/site-packages/qai_hub_models/models/llama_v3_2_3b_chat_quantized/export.py", line 46, in main
export_model(
File "/home/user/llm_on_genie_venv/lib/python3.10/site-packages/qai_hub_models/models/_shared/llama3/export.py", line 238, in export_model
link_job = hub.submit_link_job(models, name=full_name)
File "/home/user/llm_on_genie_venv/lib/python3.10/site-packages/qai_hub/client.py", line 4455, in submit_link_job
model.producer is None
AttributeError: 'NoneType' object has no attribute 'producer'
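The traceback suggests what happens: when one of the compile jobs fails ("Internal compiler error" above), the corresponding entry in the model list passed to hub.submit_link_job apparently ends up as None, and the `model.producer is None` check at client.py:4455 then raises AttributeError. A minimal sketch of this failure mode, using hypothetical stand-ins (Model, submit_link_job_precheck are illustrative names, not qai_hub code), and a guard that would surface a clearer error:

```python
class Model:
    """Hypothetical stand-in for a successfully compiled model object."""
    def __init__(self, producer=None):
        self.producer = producer


def submit_link_job_precheck(models):
    """Illustrative guard: reject None entries before touching .producer.

    Mirrors the attribute access that crashes in qai_hub.client, but
    raises a descriptive error instead of AttributeError.
    """
    missing = [i for i, m in enumerate(models) if m is None]
    if missing:
        raise ValueError(
            f"compile job(s) at index {missing} produced no model; "
            "cannot submit link job"
        )
    # The real client checks model.producer; on a None entry this line
    # would be the one that raises AttributeError.
    return all(m.producer is None for m in models)


# Second compile job failed, so its slot in the list is None:
models = [Model(), None, Model()]
try:
    submit_link_job_precheck(models)
except ValueError as e:
    print(e)
```

This is only a sketch of the symptom; the underlying issue is the internal compiler error on one of the 65536-context "token" model parts, which the export script should detect before attempting the link job.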
Full compile job log: jgn61e2k5.log
Host configuration: