
Conversation

@cyyever (Contributor) commented Aug 24, 2025

Description

This PR adds a new quantization scheme: ADAQUANT, as introduced in the paper Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning.

ADAQUANT converts float tensors into integer tensors. Combined with an additional compression step that packs the low-bit integers, it reaches close to a 10X compression ratio, as the following test results indicate:

```
2025-08-24 11:10:17,096 - INFO - Quantized 147/147 params. Before quantization: 5716.26 MB. After quantization: 0.00 MB with meta: 602.34 MB.
2025-08-24 11:12:25,513 - INFO - Dequantized 147/147 params. Before dequantization: 5716.26 MB with meta: 602.34 MB. After dequantization: 5716.26 MB.
```
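For reference, going from 5716.26 MB of float parameters down to 602.34 MB of quantized payload plus metadata is roughly a 9.5X reduction, which is where the near-10X figure comes from.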

These results were produced by running the example under NVFlare/examples/advanced/llm_hf with the command:

```
python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_nf4 --job_dir ${PWD}/workspace/jobs/hf_sft_nf4 --train_mode SFT --quantize_mode adaquant
```
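To make the mechanism concrete, here is a minimal, self-contained sketch of the two steps the description refers to: affine float-to-integer quantization followed by packing two 4-bit codes per byte. This is an illustration under assumed names (`quantize_4bit`, `pack_4bit`, `unpack_dequantize` are hypothetical), not the PR's actual implementation or NVFlare's API:

```python
# Illustrative sketch only -- not the ADAQUANT filter from this PR.
import numpy as np

def quantize_4bit(x: np.ndarray):
    """Affine-quantize a float32 tensor to unsigned 4-bit codes in [0, 15]."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 15.0 or 1.0  # guard against all-constant tensors
    codes = np.clip(np.round((x - lo) / scale), 0, 15).astype(np.uint8)
    return codes, scale, lo  # scale and offset are the dequantization metadata

def pack_4bit(codes: np.ndarray) -> np.ndarray:
    """Pack two 4-bit codes into each uint8, halving the payload again."""
    flat = codes.ravel()
    if flat.size % 2:  # pad odd-sized tensors with one dummy code
        flat = np.append(flat, np.uint8(0))
    return (flat[0::2] << 4) | flat[1::2]

def unpack_dequantize(packed: np.ndarray, scale: float, offset: float, shape) -> np.ndarray:
    """Reverse the packing, then map integer codes back to float32."""
    codes = np.empty(packed.size * 2, dtype=np.uint8)
    codes[0::2] = packed >> 4
    codes[1::2] = packed & 0x0F
    n = int(np.prod(shape))
    return (codes[:n].astype(np.float32) * scale + offset).reshape(shape)

x = np.random.randn(3, 5).astype(np.float32)
codes, scale, offset = quantize_4bit(x)
packed = pack_4bit(codes)  # 4 bits per value vs. 32 bits per value in float32
x_hat = unpack_dequantize(packed, scale, offset, x.shape)
assert np.allclose(x, x_hat, atol=scale)  # reconstruction error bounded by one quantization step
```

In this sketch the per-tensor scale and offset play the role of dequantization metadata; the actual filter's metadata layout may differ.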

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@cyyever cyyever force-pushed the ada_quant branch 2 times, most recently from 136869e to cb5940e on August 24, 2025 03:59
@ZiyueXu77 (Collaborator) left a comment

Overall looks good! I think it makes sense for it to be included as an option in our existing quantization filters. Regarding compression, we will discuss and decide whether to keep it here or make it more general.

@cyyever (Contributor, Author) commented Aug 28, 2025

My general opinion is that we can refactor compression into an orthogonal optimization component. However, given the deadline of my Google Summer of Code project, let's do that in later PRs and keep this one unchanged.

@holgerroth (Collaborator)

Thanks for your contribution @cyyever. Let's wait until we have the 2.7 release branch before merging this into main. @yanchengnv, @chesterxgchen for visibility.

@ZiyueXu77 ZiyueXu77 mentioned this pull request Sep 4, 2025
@cyyever cyyever force-pushed the ada_quant branch 3 times, most recently from 0204f0d to 2717fd1 on September 8, 2025 03:29
@ZiyueXu77 (Collaborator) left a comment

One question about the quantized model and meta sizes.

This reverts commit 4e8b356.