Add ADAQUANT quantization scheme #3628
base: main
Conversation
Force-pushed from 136869e to cb5940e.
Overall this looks good! I think it makes sense to include it as an option in our existing quantization filters. Regarding compression, we will discuss and decide whether we want to keep it here or make it more general.
My general opinion is that we can refactor compression into an orthogonal optimization component. However, due to the deadline of my Google Summer of Code project, let's do that in later PRs and leave this one unchanged.
Thanks for your contribution @cyyever. Let's wait until we have the 2.7 release branch before merging this into main. @yanchengnv, @chesterxgchen for visibility.
Force-pushed from 0204f0d to 2717fd1.
One question about the quantized model and metadata size.
Description
This PR adds a new quantization scheme, ADAQUANT, as introduced in the paper "Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning".
ADAQUANT converts float tensors into integer tensors. Combined with an additional compression step that packs the low-bit integers, it can reach a nearly 10x compression ratio, as shown in the test results below.
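For illustration, here is a minimal sketch of the two steps described above (quantize a float tensor to low-bit integers, then pack two 4-bit values per byte). This is not the implementation added by this PR; the symmetric per-tensor scheme and all function names are assumptions made for the example.

```python
import numpy as np

def quantize_4bit(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization of floats to signed 4-bit values in [-8, 7].

    Illustrative only; the ADAQUANT filter in this PR may differ.
    """
    scale = float(np.abs(x).max()) / 7.0
    if scale == 0.0:  # all-zero tensor: avoid division by zero
        scale = 1.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def pack_4bit(q: np.ndarray) -> np.ndarray:
    """Pack two 4-bit values into each byte (the extra compression step)."""
    nibbles = (q.astype(np.int16) & 0x0F).astype(np.uint8).ravel()
    if nibbles.size % 2:  # pad odd-length tensors with one zero nibble
        nibbles = np.concatenate([nibbles, np.zeros(1, dtype=np.uint8)])
    return (nibbles[0::2] << 4) | nibbles[1::2]

def unpack_dequantize(packed: np.ndarray, scale: float, numel: int) -> np.ndarray:
    """Reverse both steps: split the bytes back into nibbles, sign-extend, rescale."""
    nibbles = np.empty(packed.size * 2, dtype=np.int8)
    nibbles[0::2] = (packed >> 4).astype(np.int8)
    nibbles[1::2] = (packed & 0x0F).astype(np.int8)
    nibbles = np.where(nibbles > 7, nibbles - 16, nibbles)  # sign-extend 4-bit values
    return nibbles[:numel].astype(np.float32) * scale
```

Round-tripping a tensor through these three functions reconstructs it up to quantization error; the per-tensor scale is the only metadata this sketch needs, and the packing step halves the storage compared to keeping one 4-bit value per byte.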
These results were obtained by running under `NVFlare/examples/advanced/llm_hf` with the command `./runtest.sh`.

Types of changes