Skip to content

Updated How Codegen Handles Unsupported Memlets #2033

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

philip-paul-mueller
Copy link
Collaborator

There was a very subtle error in codegen.
At the beginning codegen will scan all edges and translate anyone, that it can not handle by a native call, into a Map, for this the CopyToMap transformation was used.
The error arises when there is not one but multiple edges between the same AccessNodes, since the codegen passes source and destination node to the transformation and not the edge in question and CopyToMap only translated a single edge is translated, there is no guarantee that the right one was translated, which is an obviously wrong behaviour.

In this PR the functionality of turning an edge into a Map was moved to a function, memlet_to_map() inside the memlet_utils module, that operated on a edge, thus it is more targeted (or targeting the object in question at all).
The code generator was also patched to use this function (it is not possible to use the new CopyToMap transformation, because it is not possible to select the edge that should be translated and we only one to translate the edge that is not supported or the wrong one).

For compatibility it keeps the CopyToMap transformation around, however, it now uses the new helper functions.
Furthermore, it now transforms all edges between the two nodes, the only logical operation.

In GT4Py we have found that the implicit conversion of Maps by the code generator caused more harm than good.
Thus, this PR also adds a new configuration flag that allows to turn the implicit conversion of Memlets to Maps into errors,
by default it is still allowed.

This is needed because we have to replace `CopyToMap`.
…nto a Map.

The reason for this is quite simple.
Before the code generator used `CopyToMap` to turn unsupported Memlets into Maps.
However, that transformation operated on AccessNodes, but if there are multiple edges between the nodes then it was unspecific whcih one was turned.
In the worst case the one that would be supported was turned into a Map but the unsupported not.
Since the `memlet_to_map()` function operates on edges this issue is no longer possible.
…()` function, this also modifies (in the sense of fixing a bug) its behaviour.

The bug happens in case there are multiple edges between the AccessNodes.
One issue is that while `can_be_applied()` checked all edges, `apply()` only translated `state.edges_between(self.a, self.b)[0]`.
Which might some random edge.

The new behaviour will translate all edges that are appropriate,
Which is the only thing that make sense.
… if implicit Memlet to Map translation is performed.
In fact I just relocated some of the `CopyToMap` tests.
@philip-paul-mueller philip-paul-mueller added bug Something isn't working codegen labels Jun 6, 2025
@acalotoiu acalotoiu requested a review from tbennun June 11, 2025 07:26
philip-paul-mueller added a commit to GridTools/gt4py that referenced this pull request Jun 11, 2025
If the DaCe GPU code generator encounters a Memlet that can not be
expressed as a `cudaMemcpy*()` call, then it converts it to a Map.
However, the issue is that these Maps have the wrong iteration order,
i.e. wrong memory access pattern and it might not even launch, because
of too many blocks in the `y` direction of the compute grid.
For this reason GT4Py has to handle these Memlets explicitly.

However, [DaCe PR#1976](spcl/dace#1976) changed
this slightly and thus GT4Py had to follow.
Note, that some of these changes were already introduced by [GT4Py
PR#2004](#2004), however, they
were made for the original version of the DaCe PR (and the GT4Py PR had
to be merged before the DaCe PR was merged).

Furthermore, this PR fixes a different issue, also related to the
expansion of Memlets, which can be found in [DaCe
PR#2033](spcl/dace#2033) (it is not yet merged
and currently at commit `19b6bba`).
It fixes a bug in how the Memlets are expanded.
The DaCe PR also adds the possibility to generate an error instead of
slightly converting Memlets into Maps and this PR enables this feature.

---------

Co-authored-by: edopao <[email protected]>
edopao added a commit to GridTools/gt4py that referenced this pull request Jun 12, 2025
Use new commit of
[dace:gt4py-next-integration](https://github.com/GridTools/dace/tree/gt4py-next-integration)
branch that includes the following dace PRs:
- [NVTX Ranges](philip-paul-mueller/dace#2)
- [Update the handling of not supported
Memlets](spcl/dace#2033)
- [PruneConnector now uses the isolation function for nested
SDFG](spcl/dace#2040)
- [Modifications of the DaCe Default
Configuration](GridTools/dace#6)
@tbennun tbennun enabled auto-merge June 28, 2025 01:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working codegen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants