Conversation

@andrey-churkin andrey-churkin commented Oct 13, 2025

Changes

  • This PR introduces a new keep_axes parameter for TensorReducerBase. This parameter specifies the axes to preserve during the reduction operation. It is used to calculate the reduction axes during statistic collection, so the actual tensor shape is not required before inference; only the number of dimensions (ndim) is needed.

  • Modifies the SmoothQuant algorithm to use the keep_axes parameter for the ONNX backend instead of relying on the tensor shape from the NNCF graph, as this shape isn't always available.
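The relationship between keep_axes and the reduction axes described above can be sketched as follows. This is an illustrative helper, not the actual NNCF implementation; the function name calculate_reduction_axes is an assumption.

```python
def calculate_reduction_axes(ndim: int, keep_axes: tuple[int, ...]) -> tuple[int, ...]:
    """Derive the axes to reduce over from the axes to preserve.

    Only the number of dimensions is needed, not the full tensor shape,
    which is why keep_axes can be resolved lazily at collection time.
    """
    # Normalize negative indices, e.g. -1 -> ndim - 1.
    keep = {axis % ndim for axis in keep_axes}
    return tuple(axis for axis in range(ndim) if axis not in keep)


# For a 4-D activation whose channel axis 1 must be preserved:
print(calculate_reduction_axes(4, (1,)))  # (0, 2, 3)
```

With keep_axes the statistic collector only needs ndim at inference time, whereas reduction_axes requires knowing the axes up front.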

Related tickets

Ref: 173880, Ref: 174334

Tests

  • Build post_training_quantization # 735

@andrey-churkin andrey-churkin requested a review from a team as a code owner October 13, 2025 07:07
Comment on lines +157 to +164

@staticmethod
def get_abs_max_reducer_cls() -> type[OVAbsMaxReducer]:
return OVAbsMaxReducer

@staticmethod
def get_shape_reducer_cls() -> type[OVShapeReducer]:
return OVShapeReducer
andrey-churkin (Contributor, Author):
We add the get_abs_max_reducer_cls() and get_shape_reducer_cls() methods here because the OpenVINO backend uses the OVAbsMaxReducer and OVShapeReducer classes instead of AbsMaxReducer and ShapeReducer to enable in-place statistic collection.

@andrey-churkin andrey-churkin changed the title from "SQ" to "[ONNX][SmoothQuant] Introduce a new keep_axes parameter" Oct 15, 2025
@andrey-churkin andrey-churkin added the NNCF Common and NNCF ONNX labels Oct 15, 2025
@github-actions github-actions bot removed the NNCF ONNX label Oct 15, 2025
@daniil-lyakhov daniil-lyakhov self-requested a review October 15, 2025 09:28
@nikita-savelyevv nikita-savelyevv (Collaborator) left a comment:

Should we perhaps add a test with an ONNX model for which ndim is not known beforehand, to have an example of why the keep_axes approach is introduced?

andrey-churkin commented Oct 16, 2025

Should we perhaps add a test with an ONNX model for which ndim is not known beforehand, to have an example of why the keep_axes approach is introduced?

Thank you for the suggestion. I’ll consider how to implement it.

UPD: This problem is reproduced on the timm/visformer_small model from the ptq scope.

-    def __init__(self, reduction_axes: Optional[ReductionAxes] = None, inplace: bool = False):
+    def __init__(
+        self,
+        reduction_axes: Optional[ReductionAxes] = None,
Collaborator:
Should we forward this parameter to the children of TensorReducerBase?

def __init__(
self,
reduction_axes: Optional[ReductionAxes] = None,
keep_axes: Optional[tuple[int, ...]] = None,
Collaborator:
Suggested change:
-    keep_axes: Optional[tuple[int, ...]] = None,
+    keep_axes: Optional[Axes] = None,

Perhaps we could rename ReductionAxes and reuse it there?


     def __hash__(self) -> int:
-        return hash((self.__class__.__name__, self.inplace, self._reduction_axes))
+        return hash((self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes))
Collaborator:
Perhaps we should update the __hash__ methods of some TensorReducerBase subclasses as well.
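The concern above is that keep_axes must participate in hashing and equality, otherwise two reducers that differ only in keep_axes would be deduplicated as identical. A minimal sketch with a simplified stand-in class (TensorReducerSketch is an illustrative name, not NNCF API):

```python
class TensorReducerSketch:
    """Simplified stand-in for TensorReducerBase; names are illustrative."""

    def __init__(self, reduction_axes=None, keep_axes=None, inplace=False):
        self._reduction_axes = reduction_axes
        self._keep_axes = keep_axes
        self.inplace = inplace

    def __eq__(self, other):
        return (
            isinstance(other, self.__class__)
            and self.inplace == other.inplace
            and self._reduction_axes == other._reduction_axes
            and self._keep_axes == other._keep_axes
        )

    def __hash__(self):
        # _keep_axes is included so reducers differing only in keep_axes
        # are not merged inside hash-based collector containers.
        return hash(
            (self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes)
        )


a = TensorReducerSketch(keep_axes=(1,))
b = TensorReducerSketch(keep_axes=(2,))
print(a == b)  # False: differing keep_axes must not compare equal
```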


def test_get_abs_max_channel_collector(self, inplace_statistics: bool):
backend = self.get_backend()
reduction_axes = (3, 2, 1)
Collaborator:
Please also test self._backend_entity.get_abs_max_reducer_cls() and self._backend_entity.get_shape_reducer_cls().

Comment on lines +307 to +312
if model_backend == BackendType.ONNX:
keep_axes = (self._backend_entity.get_activation_channel_axis(node_to_smooth, input_act_port),)
reduction_axes = None
else:
keep_axes = None
reduction_axes = self._calculate_input_reduction_axes(graph, node_to_smooth, input_act_port)
Collaborator:
Usually we create a method in the backend to resolve such situations. Why not introduce a backend method here? The comment could then be placed as the method's docstring.
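The refactoring suggested above could look roughly like this. Class and method names (SmoothQuantBackendSketch, get_axes) are assumptions for illustration, not the actual NNCF backend API:

```python
class SmoothQuantBackendSketch:
    """Common backend behavior (hypothetical sketch)."""

    def get_axes(self, channel_axis, graph_reduction_axes):
        """Return (keep_axes, reduction_axes) for statistic collection.

        By default the reduction axes computed from the NNCF graph
        shape are used directly.
        """
        return None, graph_reduction_axes


class ONNXSmoothQuantBackendSketch(SmoothQuantBackendSketch):
    """ONNX backend (hypothetical sketch).

    The activation shape is not always available in the NNCF graph, so
    only the channel axis is preserved and the reducer derives the
    reduction axes from ndim at collection time.
    """

    def get_axes(self, channel_axis, graph_reduction_axes):
        return (channel_axis,), None
```

The algorithm then calls a single backend method instead of branching on BackendType.ONNX inline.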

):
stats = tensor_collector.get_statistics()
shape = stats[SHAPE_BRANCH_KEY]
shape = tuple() if shape is None else tuple(shape.tolist())
Collaborator:
When could shape be None?

Comment on lines 63 to +64
self._reduction_axes = reduction_axes
self._keep_axes = keep_axes
Contributor:
Are these 2 variables mutually exclusive?
If so, I'd rename _reduction_axes to _axes and add an _axes_mode field with the values REDUCTION or KEEP.


Labels

Code Freeze, NNCF Common


4 participants