
Conversation

anzr299 (Collaborator) commented Aug 14, 2025

### Changes

Remove the code that obtains the merge node; instead, get the merge node directly from the matched nodes.

### Reason for changes

The Matmul1 -> mul -> Matmul2 pattern was causing an error when Matmul1 was kept in FP32 precision: the first MatMul node was not found in all_weight_params.

According to my experiments, the handling of both pattern groups (MatMul->Multiply->MatMul, and Act->MatMul or Act->Multiply->MatMul) performs essentially the same task: obtaining the first node from the matched list of nodes. A sketch of the idea follows.
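As a rough illustration only (the helper name `get_merge_node` and the surrounding plumbing are assumptions for this sketch, not the actual NNCF code):

```python
from typing import List


def get_merge_node(graph, match: List[str]):
    """Hypothetical sketch of the simplification described above.

    Both pattern groups (MatMul->Multiply->MatMul, and
    Act->MatMul or Act->Multiply->MatMul) list the matched node keys in
    topological order, so the merge node is simply the first entry of the
    match. Previously the merge node was instead looked up through
    all_weight_params, which failed when the first MatMul stayed in FP32
    and therefore had no weight-compression entry.
    """
    # NNCFGraph exposes get_node_by_key(); everything else here is
    # illustrative.
    return graph.get_node_by_key(match[0])
```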

### Tests

| Category | Job | Status | Job Number | Notes |
| --- | --- | --- | --- | --- |
| GH Examples | Examples | Pass | 382 | |
| GH WC | Weights Compression | Pass | 137 | |

Perplexity evaluation (a rough reproduction sketch follows the results table):
- Compression settings: ratio=0.8, Hessian Input Activation sensitivity metric, backup_mode NONE
- Evaluation settings: limit 10 (evaluate with only 10 samples in lm_eval)
- Model: microsoft/Phi-3-mini-128k-instruct

| Branch | AWQ | Perplexity |
| --- | --- | --- |
| Develop Branch | No | 11.157 |
| Develop Branch | Yes | 11.0583 |
| After Change | Yes | 11.0583 |
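For context, the settings above map roughly onto the following calls; the 4-bit mode, the calibration dataset handling, and the wikitext task name are assumptions for this sketch, not taken from the PR:

```python
import nncf  # weight compression settings from the list above
import lm_eval  # perplexity measurement, limited to 10 samples

# `ov_model` and `calibration_dataset` are assumed to be prepared elsewhere.
compressed_model = nncf.compress_weights(
    ov_model,
    mode=nncf.CompressWeightsMode.INT4_SYM,  # assumed 4-bit mode
    ratio=0.8,
    awq=True,  # switched off for the first row of the table
    sensitivity_metric=nncf.SensitivityMetric.HESSIAN_INPUT_ACTIVATION,
    backup_mode=nncf.BackupMode.NONE,
    dataset=calibration_dataset,  # needed for AWQ and data-aware metrics
)

# Perplexity via lm_eval's Python API on the baseline HF checkpoint;
# evaluating the compressed OpenVINO model would additionally need an
# optimum-intel wrapper, omitted here.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=microsoft/Phi-3-mini-128k-instruct",
    tasks=["wikitext"],  # assumed perplexity task
    limit=10,  # evaluate with only 10 samples, as in the settings above
)
```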

### Related tickets

172326

anzr299 requested a review from a team as a code owner August 14, 2025 17:06
anzr299 marked this pull request as draft August 14, 2025 17:06
anzr299 marked this pull request as ready for review August 14, 2025 19:40
daniil-lyakhov (Collaborator) commented

@anzr299, please check #2708 and ticket 141131. It looks like this code was added to support some LLMs; we need to make sure that these models are compressed as expected.

daniil-lyakhov self-requested a review August 18, 2025 09:28
anzr299 (Collaborator, Author) commented Aug 18, 2025

> @anzr299, please check #2708 and ticket 141131. It looks like this code was added to support some LLMs; we need to make sure that these models are compressed as expected.

Ah I see, I will apply another workaround instead.

anzr299 marked this pull request as draft September 8, 2025 09:35
anzr299 (Collaborator, Author) commented Oct 21, 2025

> @anzr299, please check #2708 and ticket 141131. It looks like this code was added to support some LLMs; we need to make sure that these models are compressed as expected.

I have verified this by checking the accuracy of the microsoft/Phi-3-mini-128k-instruct model before and after the change with AWQ enabled. It is the same, so this change does not appear to affect the model.

anzr299 marked this pull request as ready for review October 21, 2025 13:45
andreyanufr (Collaborator) left a comment

Checked on llama (MatMul->Multiply->MatMul) and phi (activations->MatMul). The new changes do not affect the insertion location of the scale.

AlexanderDokuchaev merged commit 4ee9d00 into openvinotoolkit:develop Oct 23, 2025
20 checks passed