@igogo-x86

When a block has several Vregs that share the same penalty vector, the per-Vreg penalty tweaks in color_block_initialize don’t enforce a consistent ordering among them. The block can end up with the right set of physregs but assigned in a different order than in its predecessors, which forces extra copies on the incoming edges. After coloring, reorder each shared-penalty group to match the ordering chosen by the already-processed predecessors, provided those predecessors agree with each other. This keeps the assignment pattern stable across blocks and avoids the redundant copies.

This is enabled only on AArch64: it has no xchg, so an inconsistent ordering forces a copy2 that lowers to three 4-byte instructions, whereas x86 would emit a single 2-byte xchg.
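
To make the reordering step concrete, here is a minimal Python sketch of the idea rather than the actual change. Everything in it is an assumption made for illustration: the function name `reorder_shared_penalty_groups`, the dictionary-based representation of a coloring, and the choice to model a penalty vector as a tuple. It groups Vregs by penalty vector and, when all already-processed predecessors agree on an assignment that uses the same physreg set, adopts that assignment.

```python
from collections import defaultdict


def reorder_shared_penalty_groups(coloring, penalties, pred_colorings):
    """Hypothetical sketch of the post-coloring reorder.

    coloring:       dict Vreg -> physreg chosen for the current block
    penalties:      dict Vreg -> penalty vector (a tuple, so it is hashable)
    pred_colorings: list of dicts Vreg -> physreg, one per already-processed
                    predecessor
    Returns a new coloring; the input is not modified.
    """
    # Group Vregs that share exactly the same penalty vector.
    groups = defaultdict(list)
    for vreg, vector in penalties.items():
        groups[vector].append(vreg)

    result = dict(coloring)
    for vregs in groups.values():
        if len(vregs) < 2:
            continue  # nothing to reorder in a singleton group

        physregs = {coloring[v] for v in vregs}
        wanted = None
        for pred in pred_colorings:
            # Every Vreg of the group must be colored in the predecessor.
            if not all(v in pred for v in vregs):
                wanted = None
                break
            assignment = {v: pred[v] for v in vregs}
            # Only a permutation of the same physreg set is acceptable;
            # otherwise the swap could clobber an unrelated Vreg.
            if set(assignment.values()) != physregs:
                wanted = None
                break
            if wanted is None:
                wanted = assignment
            elif wanted != assignment:
                wanted = None  # predecessors disagree, keep current order
                break

        if wanted is not None:
            result.update(wanted)
    return result
```

Because the sketch only permutes physregs within a group, it never changes which registers the block uses, only which Vreg gets which one. When predecessors disagree, or there are none yet, the original coloring is returned unchanged.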

@meta-codesync

meta-codesync bot commented Dec 17, 2025

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D89387615. (Because this pull request was imported automatically, there will not be any future comments.)
