Adding Tests for CadenceFusedConvReluQuantizer #16358
Conversation
Pull request overview
This PR adds comprehensive test coverage for the CadenceFusedConvReluQuantizer and several other previously untested Cadence quantizers. The key innovation is extending the test framework to support fused quantization patterns, where multiple operations (e.g., conv2d + relu) are quantized as a single unit, requiring annotations to be split across different nodes in the computation graph.
- Updated the graph builder function signature to optionally return a third element (input source node) for fused patterns (a minimal builder sketch follows this list)
- Modified the test assertion logic to check output annotations on the output node and input annotations on the input source node
- Added 13 new test cases covering 6 previously untested quantizer classes
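To make the extended builder contract concrete, here is a minimal, hedged sketch of what a fused-pattern builder could look like. It uses plain torch.fx rather than the repository's own graph-builder helpers, and the class, function, and variable names are illustrative assumptions, not the code added in this PR.

```python
import torch
import torch.fx


class ConvRelu(torch.nn.Module):
    """Tiny conv2d + relu module used only to illustrate the fused pattern."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


def build_conv_relu_graph() -> tuple[torch.fx.GraphModule, torch.fx.Node, torch.fx.Node]:
    gm = torch.fx.symbolic_trace(ConvRelu())
    conv_node = next(
        n for n in gm.graph.nodes if n.op == "call_module" and n.target == "conv"
    )
    relu_node = next(
        n for n in gm.graph.nodes if n.op == "call_function" and n.target is torch.relu
    )
    # Fused pattern: the output annotation will land on relu_node and the input
    # annotations on conv_node, so the builder returns both nodes in addition to gm.
    return gm, relu_node, conv_node
```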
Summary: Pull Request resolved: pytorch#16358

A fused pattern is when the quantizer recognizes a sequence of operations and treats it as a single unit for quantization purposes. For example, for a Conv2D + ReLU fusion, rather than having something like this:

```
input → [quantize] → conv2d → [dequantize] → [quantize] → relu → [dequantize] → output
```

a fused pattern quantizes them together like so:

```
input → [quantize] → conv2d → relu → [dequantize] → output
```

We need to make a few changes in our framework to test this.

# Change 1: We allow graph builders to return a 3rd element for fused patterns

For fused patterns like conv+relu, the quantization annotations are split across two nodes:
- The output annotation is on the relu node (the final output of the fused pattern)
- The input annotations are on the conv node (where the quantized inputs enter)

The existing graph builders return (gm, target_node), which works for single-op patterns where both annotations are on the same node. For fused patterns, we need to know both nodes, so graph builders can now optionally return (gm, output_node, input_source_node).

# Change 2: We check annotations on the correct nodes for fused patterns

The test previously assumed output_qspec and input_qspec_map were both on the same node. For fused patterns, they're on different nodes:
- output_qspec is checked on the output node (relu)
- input_qspec_map is checked on the input source node (conv)

This change is backwards-compatible: for non-fused patterns, both nodes are the same.

Reviewed By: hsharma35

Differential Revision: D89630759
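For the second change, the gist of the updated check might look roughly like the sketch below. The `quantization_annotation` meta key and the `output_qspec` / `input_qspec_map` attributes follow the standard torch.ao quantizer annotation layout; the helper name and the rest of the structure are assumptions for illustration, not the exact test code.

```python
def check_fused_annotations(result, expected_output_qspec) -> None:
    """Hedged sketch: unpack a 2- or 3-tuple builder result and check the
    annotations on the correct nodes (names are illustrative assumptions)."""
    if len(result) == 3:
        gm, output_node, input_source_node = result
    else:
        gm, output_node = result
        # Non-fused pattern: both annotations live on the same node.
        input_source_node = output_node

    # output_qspec lives on the fused pattern's output node (e.g. relu) ...
    output_annotation = output_node.meta["quantization_annotation"]
    assert output_annotation.output_qspec == expected_output_qspec
    # ... while input_qspec_map lives on the node where inputs enter (e.g. conv).
    input_annotation = input_source_node.meta["quantization_annotation"]
    assert len(input_annotation.input_qspec_map) > 0
```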
af22ffe to ccdb8e8 Compare
ccdb8e8 to fd92112 Compare
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
```python
def _build_layer_norm_graph(self) -> tuple[torch.fx.GraphModule, torch.fx.Node]:
    """Build a simple graph with a layer_norm operation."""
    # Input shape: (batch, features)
    x = torch.randn(1, 10)
    # normalized_shape must match the last dimension(s) of input
    normalized_shape = [10]
    gm = single_op_builder(
        placeholders=(x,),
        op=torch.ops.aten.layer_norm.default,
        args=(x, normalized_shape),
    )
    layer_norm_nodes = gm.graph.find_nodes(
        op="call_function",
        target=torch.ops.aten.layer_norm.default,
    )
    self.assertEqual(
        len(layer_norm_nodes), 1, "Should find exactly one layer_norm node"
    )
    # Add source_fn_stack metadata required by quantizer pattern matching
    layer_norm_nodes[0].meta["source_fn_stack"] = [
        ("layer_norm", torch.ops.aten.layer_norm.default)
    ]
    return gm, layer_norm_nodes[0]
```
Copilot AI, Dec 22, 2025
This builder is inconsistent with the others. Most builders use GraphBuilder and include NodeMetadata with source_fn_stack at creation time. This builder uses single_op_builder and then manually adds source_fn_stack metadata after the fact. Consider either using GraphBuilder for consistency or documenting why single_op_builder is necessary for layer_norm.
```python
# Find the index of this input node in the input source node's args
arg_index = None
args = input_source_node.args
assert isinstance(args, tuple)
```
Copilot AI, Dec 22, 2025
Using Python's assert statement in test code is not ideal because it can be disabled with optimization flags. Consider using self.assertIsInstance(args, tuple) instead to ensure the check always runs.
Suggested change:
```diff
- assert isinstance(args, tuple)
+ self.assertIsInstance(args, tuple)
```
fd92112 to a0e3ae2 Compare
a0e3ae2 to e91ba25 Compare
e91ba25 to 441c5e6 Compare
441c5e6 to 9a072c1 Compare
Differential Revision: D88899457
Differential Revision: D88955761
9a072c1 to 6881523 Compare
e8a8978 to 6881523 Compare
6881523 to a98a49b Compare
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
```python
if expected_input_qspecs[arg_index] is not None:
    self.assertEqual(
        input_qspec,
        expected_input_qspecs[arg_index],
        f"Input qspec mismatch at arg index {arg_index}",
    )
```
Copilot AI, Dec 30, 2025
There's a potential IndexError when accessing expected_input_qspecs[arg_index]. While the code checks that len(input_annotation.input_qspec_map) equals len(expected_input_qspecs), it doesn't guarantee that arg_index will be within bounds. For example, if input_source_node has args at positions [0, 1, 2] and the input_qspec_map contains entries for args at positions [1, 2], the arg_index could be 2, but expected_input_qspecs might only have 2 elements (indices 0 and 1). Consider adding a bounds check before accessing expected_input_qspecs[arg_index].
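One way to address this, sticking with the variable names from the snippet above, might be to guard the lookup with an explicit bounds assertion. This is only a sketch of the suggestion, not the committed fix.

```python
# Guard against arg_index exceeding the expected list before indexing into it.
self.assertLess(
    arg_index,
    len(expected_input_qspecs),
    f"arg index {arg_index} out of range for expected input qspecs",
)
if expected_input_qspecs[arg_index] is not None:
    self.assertEqual(
        input_qspec,
        expected_input_qspecs[arg_index],
        f"Input qspec mismatch at arg index {arg_index}",
    )
```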
a98a49b to 4217e25 Compare