[QuantizationMetadata] update for missing qparams and module attributes#669
[QuantizationMetadata] update for missing qparams and module attributes#669brian-dellabetta wants to merge 4 commits into
Conversation
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthrough
Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/compressed_tensors/quantization/quant_metadata.py (1)
65-87:⚠️ Potential issue | 🟠 MajorDocstring contradicts implementation.
Line 65 states "
quantization_statusandquantization_enabledare left unchanged," but lines 81-87 now delete these attributes. Please update the docstring to reflect the new behavior.Proposed fix
Remove all artifacts of quantization from module, non-recursively. Artifacts include any qparams, quantization_scheme, or wrapped forward method that might have been altered previously in lifecycle. - `quantization_status` and `quantization_enabled` are left unchanged. - :param module: Module to clear🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/compressed_tensors/quantization/quant_metadata.py` around lines 65 - 87, The docstring is out of sync: it claims "`quantization_status` and `quantization_enabled` are left unchanged" but the implementation (within the block using unwrap_offload_forward that unwraps module.forward, calls cls.clear_all_qparams, and deletes module.quantization_scheme, module.quantization_status, and module.quantization_enabled) actually removes those attributes; update the docstring in this method to state that quantization_status and quantization_enabled are deleted/cleared (matching the code) and mention that module.forward is unwrapped and qparams and quantization_scheme are cleared to keep docs consistent with the behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compressed_tensors/quantization/quant_metadata.py`:
- Around line 29-43: The returned metadata list currently contains a duplicate
"weight_shape" because it's added explicitly and also generated by the
comprehension (base_name="weight" + suffix="shape"); update the construction in
quant_metadata.py to avoid duplicates by removing the explicit "weight_shape"
from the first concatenated list (keep KVCacheScaleType entries and
"weight_packed"), or alternatively remove "shape" from the suffix tuple and add
explicit "input_shape" and "output_shape" if you need them; ensure references to
KVCacheScaleType, base_name, and suffix in the list comprehension remain
unchanged.
---
Outside diff comments:
In `@src/compressed_tensors/quantization/quant_metadata.py`:
- Around line 65-87: The docstring is out of sync: it claims
"`quantization_status` and `quantization_enabled` are left unchanged" but the
implementation (within the block using unwrap_offload_forward that unwraps
module.forward, calls cls.clear_all_qparams, and deletes
module.quantization_scheme, module.quantization_status, and
module.quantization_enabled) actually removes those attributes; update the
docstring in this method to state that quantization_status and
quantization_enabled are deleted/cleared (matching the code) and mention that
module.forward is unwrapped and qparams and quantization_scheme are cleared to
keep docs consistent with the behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ebfd6244-37ea-4261-bff5-25d764a67f26
📒 Files selected for processing (1)
src/compressed_tensors/quantization/quant_metadata.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compressed_tensors/quantization/quant_metadata.py`:
- Around line 80-86: clear_quantization() currently deletes
module.quantization_status and quantization_enabled which makes forward readers
like forward_quantize (in
src/compressed_tensors/quantization/lifecycle/forward.py) crash when they access
module.quantization_status directly; to fix, update the forward reader to
tolerate missing attrs by replacing direct accesses with guarded lookups (e.g.,
status = getattr(module, "quantization_status", None) and check status is not
None before comparing to QuantizationStatus.COMPRESSED, and similarly guard
quantization_enabled reads), then you can either keep attribute deletion or
change clear_quantization() to set those attributes to None to match the
docstring—ensure forward_quantize and any other readers use getattr-based guards
(use the suggested minimal guard around the base_name=="weight" /
QuantizationStatus.COMPRESSED check) and update the method docstring if you
change clear_quantization() behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 15391856-32ea-496d-b013-dc7916bc2342
📒 Files selected for processing (1)
src/compressed_tensors/quantization/quant_metadata.py
kylesayrs
left a comment
There was a problem hiding this comment.
I feel strongly that this is out of scope for the quantizationmetadata class, and should instead be handled by deleting unused parameters during decompressoon
|
This pull request has merge conflicts that must be resolved before it can be |
Add some missing qparams and attributes (quantization_status, quantization_enabled) to QuantizationMetadata which were previously missing, so they weren't being pruned correctly.
Summary by CodeRabbit