Add tqdm to `apply_quantization_config` by kylesayrs · Pull Request #730 · vllm-project/compressed-tensors

kylesayrs · 2026-06-09T04:40:35Z

Purpose

When using disk offloading, creating offloaded quantization configs can take a noticeable amount of time. The user should be shown a tqdm progress bar to make it clear that the program is not hanging.
As previously explored, the cost of `match_named_modules` is negligible even for very large models, so it's fine to materialize the matched modules and compute their count upfront.
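
A minimal sketch of why the matches need to be materialized, assuming the matcher yields lazily. `match_modules` here is a hypothetical stand-in, not the library's `match_named_modules`. tqdm infers its total from `len()` of the iterable, and a generator has no length:

```python
from typing import Iterable, Iterator


def match_modules(names: Iterable[str], targets: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in for a lazy module matcher; yields matching names."""
    suffixes = tuple(targets)
    for name in names:
        if name.endswith(suffixes):
            yield name


names = ["layers.0.q_proj", "layers.0.norm", "layers.0.k_proj"]
targets = ["q_proj", "k_proj"]

# A generator has no len(), so a progress bar cannot know its total.
lazy_matches = match_modules(names, targets)
try:
    len(lazy_matches)
    lazy_has_len = True
except TypeError:
    lazy_has_len = False

# Materializing into a list costs one extra pass and provides a length.
matched = list(match_modules(names, targets))
print(lazy_has_len, len(matched))
```

The trade-off is holding the full list of matches in memory, which is small next to the modules themselves.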

Changes

Add tqdm to apply_quantization_config

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

coderabbitai · 2026-06-09T04:40:46Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3dd9e383-0f5d-43a2-8b08-01c35459ddc9


Walkthrough

The PR adds optional progress-bar display to quantization application. apply_quantization_config gains a show_progress: bool = True parameter; tqdm is imported and used to wrap the module-matching loop with a conditional progress bar controlled by disable=not show_progress. The function's docstring is updated to document the new parameter.

Changes

Progress Bar for Quantization Application

Layer / File(s)	Summary
Add optional progress bar to quantization application `src/compressed_tensors/quantization/lifecycle/apply.py`	`apply_quantization_config` signature is extended with `show_progress: bool = True`, the module-matching loop is refactored to materialize `matched_modules` and iterate with `tqdm` whose visibility is controlled via `disable=not show_progress`, `tqdm` is imported, and the docstring is updated to document the new parameter.
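
The pattern summarized above could look roughly like the sketch below. It is not the code from `apply.py`: `apply_to_matches`, the plain `Linear`/`LayerNorm` classes, and the type-name matching are hypothetical stand-ins, and the `ImportError` fallback exists only so the sketch runs without tqdm installed.

```python
from typing import Iterable, List, Set, Tuple

try:
    from tqdm import tqdm
except ImportError:
    # Fallback for this sketch only: iterate without a progress bar.
    def tqdm(iterable, **kwargs):
        return iterable


class Linear:
    """Hypothetical stand-in for a quantizable module."""


class LayerNorm:
    """Hypothetical stand-in for a module that is not targeted."""


def apply_to_matches(
    named_modules: Iterable[Tuple[str, object]],
    targets: Set[str],
    show_progress: bool = True,
) -> List[str]:
    # Materialize the matches first so the bar knows its total.
    matched = [
        (name, module)
        for name, module in named_modules
        if type(module).__name__ in targets
    ]
    applied = []
    for name, module in tqdm(
        matched,
        desc="Applying quantization config",
        disable=not show_progress,  # show_progress=False silences the bar
    ):
        applied.append(name)  # placeholder for the per-module work
    return applied


modules = [
    ("layers.0.q_proj", Linear()),
    ("layers.0.norm", LayerNorm()),
    ("layers.0.k_proj", Linear()),
]
applied = apply_to_matches(modules, {"Linear"}, show_progress=False)
```

Defaulting `show_progress` to `True` keeps the bar visible for interactive runs, while callers that log to files can pass `False`.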

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding tqdm to the apply_quantization_config function.
Description check	✅ Passed	The description clearly relates to the changeset, explaining the purpose (showing progress during quantization) and the specific change (adding tqdm).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

Review ran into problems

🔥 Problems

Linked repositories: Your configuration references 1 linked repositories, but your current plan allows 0. Analyzed ``, skipped vllm-project/llm-compressor.

coderabbitai

🧹 Nitpick comments (1)

src/compressed_tensors/quantization/lifecycle/apply.py (1)
136-144: 💤 Low value

Correct implementation of progress bar.

The implementation correctly materializes the matched modules and wraps the iteration with tqdm. The disable=not show_progress pattern matches the existing codebase convention from dispatch_with_map.
Optional: Consider adding unit parameter for clarity
 for name, submodule in tqdm(
     matched_modules,
     desc="Applying quantization config",
     disable=not show_progress,
+    unit="module",
 ):
With tqdm's default bar format, this would change the rate label from "it/s" to "module/s", providing slightly clearer context to users. The count is shown as "X/Y" either way.
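
To see what `unit` changes without running a real loop, one option is tqdm's static `format_meter` helper, which renders a bar string from fixed numbers. This is only an illustrative sketch: `render_bar` is a hypothetical wrapper, the counts and elapsed time are made up, and the function returns `None` when tqdm is not installed.

```python
from typing import Optional

try:
    from tqdm import tqdm
except ImportError:
    tqdm = None  # the sketch degrades to returning None without tqdm


def render_bar(unit: str) -> Optional[str]:
    """Render a static bar for 3 of 10 steps after 1.5 seconds."""
    if tqdm is None:
        return None
    return tqdm.format_meter(n=3, total=10, elapsed=1.5, unit=unit)


default_bar = render_bar("it")      # tqdm's default unit
module_bar = render_bar("module")   # the unit suggested in this comment
print(default_bar)
print(module_bar)
```

In the default bar format the unit appears in the rate field, so the two rendered strings differ only in that label.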
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/compressed_tensors/quantization/lifecycle/apply.py` around lines 136 -
144, The progress bar created when iterating over matched_modules should include
a unit label for clarity; update the tqdm call in apply.py (the loop that
iterates over matched_modules from match_named_modules) to pass a unit parameter
(e.g., unit="module") so the rate in the progress display reads "module/s"
instead of "it/s" while keeping disable=not show_progress and the existing desc.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 967fc852-1946-40fa-a174-7442f64d99b0

📥 Commits

Reviewing files that changed from the base of the PR and between 063d8df and 61d77ff.

📒 Files selected for processing (1)

src/compressed_tensors/quantization/lifecycle/apply.py

tqdm

61d77ff

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs mentioned this pull request Jun 9, 2026

match_named_modules O(N×M) causes multi-minute silent load tail on fine-grained MoE checkpoints (Qwen3-30B-A3B-AWQ, ~18K Linear modules) #695

Closed

coderabbitai Bot reviewed Jun 9, 2026

brian-dellabetta approved these changes Jun 9, 2026

Merge branch 'main' into kylesayrs/apply-tqdm

1ed0a29

HDCharles reviewed Jun 22, 2026

Comment thread src/compressed_tensors/quantization/lifecycle/apply.py

HDCharles approved these changes Jun 22, 2026

kylesayrs enabled auto-merge (squash) June 22, 2026 21:15

Merge branch 'main' into kylesayrs/apply-tqdm

65c6bb9

kylesayrs merged commit a3e9313 into main Jun 22, 2026
5 checks passed

kylesayrs deleted the kylesayrs/apply-tqdm branch June 22, 2026 21:25

kylesayrs merged 3 commits into main from kylesayrs/apply-tqdm

3 participants
