Conversation

@andrey-churkin andrey-churkin commented Oct 13, 2025

Changes

  • This PR introduces a new keep_axes parameter for TensorReducerBase. This parameter specifies the axes to preserve during the reduction operation. It is used to calculate the reduction axes during statistic collection, so the actual tensor shape is not required before inference; only the number of dimensions (ndim) is needed.

  • Modifies the SmoothQuant algorithm to use the keep_axes parameter for the ONNX backend instead of relying on the tensor shape from the NNCF graph, as this shape isn't always available.
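The relationship between keep_axes and the reduction axes described above can be sketched as follows. This is an illustrative helper, not the actual NNCF implementation; the function name calculate_reduction_axes is an assumption.

```python
def calculate_reduction_axes(ndim: int, keep_axes: tuple[int, ...]) -> tuple[int, ...]:
    """Derive the axes to reduce over from the axes to preserve.

    Only the number of dimensions is needed, not the full tensor shape,
    which is why keep_axes can be resolved lazily at collection time.
    """
    # Normalize negative indices, e.g. -1 -> ndim - 1.
    keep = {axis % ndim for axis in keep_axes}
    return tuple(axis for axis in range(ndim) if axis not in keep)


# For a 4-D activation whose channel axis 1 must be preserved:
print(calculate_reduction_axes(4, (1,)))  # (0, 2, 3)
```

With keep_axes the statistic collector only needs ndim at inference time, whereas reduction_axes requires knowing the axes up front.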

Related tickets

Ref: 173880, Ref: 174334

Tests

  • Build post_training_quantization # 735

@andrey-churkin andrey-churkin requested a review from a team as a code owner October 13, 2025 07:07
Comment on lines +157 to +164

@staticmethod
def get_abs_max_reducer_cls() -> type[OVAbsMaxReducer]:
return OVAbsMaxReducer

@staticmethod
def get_shape_reducer_cls() -> type[OVShapeReducer]:
return OVShapeReducer
andrey-churkin (Contributor, Author):
We add the get_abs_max_reducer_cls() and get_shape_reducer_cls() methods here because the OpenVINO backend uses the OVAbsMaxReducer and OVShapeReducer classes instead of AbsMaxReducer and ShapeReducer to enable in-place statistic collection.

@andrey-churkin andrey-churkin changed the title from "SQ" to "[ONNX][SmoothQuant] Introduce a new keep_axes parameter" Oct 15, 2025
@andrey-churkin andrey-churkin added the NNCF Common and NNCF ONNX labels Oct 15, 2025
@github-actions github-actions bot removed the NNCF ONNX label Oct 15, 2025
@daniil-lyakhov daniil-lyakhov self-requested a review October 15, 2025 09:28
@nikita-savelyevv nikita-savelyevv (Collaborator) left a comment:

Should we perhaps add a test with an ONNX model for which ndim is not known beforehand, to have an example of why the keep_axes approach is introduced?

andrey-churkin commented Oct 16, 2025

Should we perhaps add a test with an ONNX model for which ndim is not known beforehand, to have an example of why the keep_axes approach is introduced?

Thank you for the suggestion. I’ll consider how to implement it.

UPD: This problem is reproduced on the timm/visformer_small model from the ptq scope.

-    def __init__(self, reduction_axes: Optional[ReductionAxes] = None, inplace: bool = False):
+    def __init__(
+        self,
+        reduction_axes: Optional[ReductionAxes] = None,
Collaborator:
Should we forward this parameter to the children of TensorReducerBase?

def __init__(
self,
reduction_axes: Optional[ReductionAxes] = None,
keep_axes: Optional[tuple[int, ...]] = None,
Collaborator:
Suggested change:
-    keep_axes: Optional[tuple[int, ...]] = None,
+    keep_axes: Optional[Axes] = None,

Perhaps we could rename ReductionAxes and reuse it there?


     def __hash__(self) -> int:
-        return hash((self.__class__.__name__, self.inplace, self._reduction_axes))
+        return hash((self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes))
Collaborator:
Perhaps we should update the __hash__ methods of some TensorReducerBase subclasses as well.
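The concern above is that keep_axes must participate in hashing and equality, otherwise two reducers that differ only in keep_axes would be deduplicated as identical. A minimal sketch with a simplified stand-in class (TensorReducerSketch is an illustrative name, not NNCF API):

```python
class TensorReducerSketch:
    """Simplified stand-in for TensorReducerBase; names are illustrative."""

    def __init__(self, reduction_axes=None, keep_axes=None, inplace=False):
        self._reduction_axes = reduction_axes
        self._keep_axes = keep_axes
        self.inplace = inplace

    def __eq__(self, other):
        return (
            isinstance(other, self.__class__)
            and self.inplace == other.inplace
            and self._reduction_axes == other._reduction_axes
            and self._keep_axes == other._keep_axes
        )

    def __hash__(self):
        # _keep_axes is included so reducers differing only in keep_axes
        # are not merged inside hash-based collector containers.
        return hash(
            (self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes)
        )


a = TensorReducerSketch(keep_axes=(1,))
b = TensorReducerSketch(keep_axes=(2,))
print(a == b)  # False: differing keep_axes must not compare equal
```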


def test_get_abs_max_channel_collector(self, inplace_statistics: bool):
backend = self.get_backend()
reduction_axes = (3, 2, 1)
Collaborator:
Please also test self._backend_entity.get_abs_max_reducer_cls() and self._backend_entity.get_shape_reducer_cls().

Comment on lines +307 to +312
if model_backend == BackendType.ONNX:
keep_axes = (self._backend_entity.get_activation_channel_axis(node_to_smooth, input_act_port),)
reduction_axes = None
else:
keep_axes = None
reduction_axes = self._calculate_input_reduction_axes(graph, node_to_smooth, input_act_port)
Collaborator:
Usually we create a method in the backend to resolve such situations. Why not introduce a backend method here? The comment could then be placed as the method's docstring.
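The refactoring suggested above could look roughly like this. Class and method names (SmoothQuantBackendSketch, get_axes) are assumptions for illustration, not the actual NNCF backend API:

```python
class SmoothQuantBackendSketch:
    """Common backend behavior (hypothetical sketch)."""

    def get_axes(self, channel_axis, graph_reduction_axes):
        """Return (keep_axes, reduction_axes) for statistic collection.

        By default the reduction axes computed from the NNCF graph
        shape are used directly.
        """
        return None, graph_reduction_axes


class ONNXSmoothQuantBackendSketch(SmoothQuantBackendSketch):
    """ONNX backend (hypothetical sketch).

    The activation shape is not always available in the NNCF graph, so
    only the channel axis is preserved and the reducer derives the
    reduction axes from ndim at collection time.
    """

    def get_axes(self, channel_axis, graph_reduction_axes):
        return (channel_axis,), None
```

The algorithm then calls a single backend method instead of branching on BackendType.ONNX inline.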

):
stats = tensor_collector.get_statistics()
shape = stats[SHAPE_BRANCH_KEY]
shape = tuple() if shape is None else tuple(shape.tolist())
Collaborator:
When could shape be None?

Comment on lines 63 to +64
self._reduction_axes = reduction_axes
self._keep_axes = keep_axes
Contributor:
Are these 2 variables mutually exclusive?
If so, I'd rename _reduction_axes to _axes and add an _axes_mode field with the values REDUCTION or KEEP.


Labels

Code Freeze, NNCF Common


4 participants