
Reuse the jobs for exporting the llama3.2-3b model #153

Open · wants to merge 1 commit into main from gocode/topic/reuseLastJobsForExportingLlama3_3_2_3b
Conversation

@codereba commented Jan 21, 2025

I tried to export Llama to the Snapdragon 8 Elite X by following this guide:
https://github.com/quic/ai-hub-apps/tree/main/apps/android/ChatApp

I ran the command:
python -m qai_hub_models.models.llama_v3_2_3b_chat_quantized.export --context-length 2048 --device "Snapdragon 8 Elite QRD" --output-dir genie_bundle
After many hours of waiting, an error occurred; please refer to #154.

I then reran the same command:
python -m qai_hub_models.models.llama_v3_2_3b_chat_quantized.export --context-length 2048 --device "Snapdragon 8 Elite QRD" --output-dir genie_bundle

Rerunning redoes all of the already-completed steps, which wastes a lot of time and resources (many hours at a 10 Mbps upload speed).

This patch checks whether the jobs already exist and, if so, lets the user choose whether to reuse them directly (the jobs may differ in their details, but that is not the common case); see the sketch below.
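
To make the approach concrete, here is a minimal sketch of the reuse check, assuming a hypothetical local cache of submitted job IDs. `CACHE_FILE`, `_read_cache`, `get_or_submit`, and `submit_fn` are illustrative names, not the actual patch; `hub.get_job` is the AI Hub client call that looks a job up by ID.

```python
import json
import os

import qai_hub as hub  # Qualcomm AI Hub client

# Hypothetical local cache mapping export step names to submitted job IDs.
CACHE_FILE = "job_cache.json"


def _read_cache() -> dict:
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            return json.load(f)
    return {}


def get_or_submit(step_name: str, submit_fn):
    """Reuse a previously submitted job for this step if the user agrees;
    otherwise submit a new one and record its ID."""
    cache = _read_cache()
    job_id = cache.get(step_name)
    if job_id is not None:
        # Look the job up on AI Hub by its ID.
        job = hub.get_job(job_id)
        answer = input(f"Found existing job {job_id} for '{step_name}'. Reuse it? [y/N] ")
        if answer.strip().lower() == "y":
            return job
    # Submit a fresh job (e.g. a callable wrapping hub.submit_compile_job(...)).
    job = submit_fn()
    cache[step_name] = job.job_id
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)
    return job
```

With something like this, each export step could call `get_or_submit` instead of submitting unconditionally, so an interrupted run can pick up where it left off rather than re-uploading everything.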

Please note:
I think the script could also skip the quantization steps when all the jobs exist, but that is a larger change, so I would like your suggestions before attempting it.

I tested the patch locally: it reused the submitted jobs and downloaded the linked models. Please refer to the screenshot:
[screenshot: export run reusing previously submitted jobs]

@codereba force-pushed the gocode/topic/reuseLastJobsForExportingLlama3_3_2_3b branch from 32d0416 to 705c66e on January 21, 2025 at 23:38
@bhushan23 (Contributor) commented:
Thank you very much @codereba for this change.

We are also working internally on general caching for all LLM exports, which will ship in the next release.
I love how you used the job summaries to cache at the export level. Amazing work :)

We look forward to seeing more contributions from you :)

@codereba (Author) commented Jan 25, 2025


Yes, I will try to contribute more :)
