Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Clean up SSA and stop relying on memory edges for iterator flow #16345

Merged
merged 14 commits into from May 2, 2024

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Apr 26, 2024

This PR ended up being quite a bit larger than expected. I'm sorry!

This PR removes the last dependency on the IR-produced memory edges for dataflow. Some background information:

On main we implement dataflow through iterators by defining SSA reads from (and writes to) iterators by traversing back through the SSA graph to find the preceding call to container.begin(), and we then specify that a write to an iterator is a write to that container (and similar for reads). That is, in order to have flow in an example such as:

auto it = v.begin();
...
*it = source();
...
sink(v.at(0));

we have an SSA write to v at *it = source().

However, in order to conclude that it was obtained from v we need some kind of SSA to find the preceding call to container.begin(). On main we use the SSA information from the IR. However, as we've learned many times, the sound SSA information produced by the IR isn't well-suited for dataflow. In particular:

  • It required really specialized dataflow code since that's the only place where we relied on IR-based SSA
  • Since it's a sound SSA it means it fails to find the begin call on most real-world code since everything escapes (and thus can't be soundly tracked) according to the current IR alias analysis.
  • It became even more ill-suited once we merged sound IR.

This PR fixes this by instead computing iterator flow after dataflow SSA has finished. Thus, we have the full power of the dataflow SSA to find the call to begin. This is then much easier to plug into dataflow, and gives better results overall.

The most terrifying commit in this PR is 50775d0 which is a preparation to ea1b8a3 that implements getAnUltimateDefinition(). That's the actual predicate we use to go from *it to it = v.begin(), but in order to implement it properly I had to restructure quite a lot of the SSA predicates.

As a small bonus of 50775d0 it fixes a small amount of spurious flow and some inconsistency errors. The effect of this can be seen in 5f0efc1.

Finally, all of this leads up to 683fe26 which instantiates the shared SSA library to reason about iterators.

Commit-by-commit heavily recommended. I've split it up in (hopefully) meaningful chunks.

@github-actions github-actions bot added the C++ label Apr 26, 2024
@MathiasVP MathiasVP force-pushed the cleanup-ssa-and-iterator-flow branch from 2e7c1a1 to 48f7520 Compare April 28, 2024 22:42
@MathiasVP MathiasVP force-pushed the cleanup-ssa-and-iterator-flow branch 2 times, most recently from df6c268 to a7fab9a Compare April 28, 2024 22:56
@MathiasVP MathiasVP force-pushed the cleanup-ssa-and-iterator-flow branch from a7fab9a to 9de1f49 Compare April 28, 2024 23:02
@MathiasVP MathiasVP force-pushed the cleanup-ssa-and-iterator-flow branch from 9de1f49 to 401717d Compare April 29, 2024 08:38
@MathiasVP MathiasVP marked this pull request as ready for review April 29, 2024 20:39
@MathiasVP MathiasVP requested a review from a team as a code owner April 29, 2024 20:39
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Apr 29, 2024
@MathiasVP MathiasVP merged commit f7113e0 into github:main May 2, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants