reduce memory usage during repack #168

ReinForce-II · 2025-03-07T12:26:32Z

Solves #166.

wejoncy · 2025-03-07T12:49:40Z

Thanks for your contribution.

One question: have you verified that it reduces memory usage? Could you please paste the numbers here?

Much appreciated.

ReinForce-II · 2025-03-07T14:20:49Z

python -m qllm --load Qwen/Qwen2.5-14B-Instruct-AWQ --eval --save /tmp/tmp0012 --pack_mode=GPTQ

w/o this change, peak memory usage is 66.8GiB.
w/ this change, peak memory usage is 23.0GiB.

I didn't do more tests.
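The thread does not say how the peak figures were measured, or whether they refer to host or GPU memory. As a minimal sketch of one way to reproduce a host-side measurement, assuming Linux (where `ru_maxrss` is reported in KiB), the helper below runs a command and reads the peak resident set size of the child process. The function name `peak_child_rss_gib` and the placeholder child command are illustrative and not part of qllm.

```python
import resource
import subprocess
import sys


def peak_child_rss_gib(cmd):
    """Run cmd to completion and return the peak RSS of waited-for children in GiB.

    Assumption: Linux, where ru_maxrss is in KiB (macOS reports bytes instead).
    This covers host memory only; GPU memory is not included.
    """
    subprocess.run(cmd, check=True)
    # RUSAGE_CHILDREN reports the largest peak RSS among children already waited for.
    max_rss_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return max_rss_kib / (1024 ** 2)


if __name__ == "__main__":
    # Placeholder child process; substitute the qllm command from the comment above.
    cmd = [sys.executable, "-c", "x = bytearray(50 * 1024 * 1024)"]
    print(f"peak RSS: {peak_child_rss_gib(cmd):.2f} GiB")
```

Running the helper once on each branch with the same command gives two comparable numbers. For reference, the figures reported above (66.8 GiB down to 23.0 GiB) correspond to a reduction of roughly 66%.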

reduce memory usage during repack

22690f7

wejoncy merged commit df20c15 into wejoncy:main Mar 7, 2025
2 checks passed

wejoncy linked an issue Mar 7, 2025 that may be closed by this pull request

Memory consumption about convert model format #166

Closed
