Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor alias analysis (WIP) #819

Open
wants to merge 4 commits into
base: scala-2
Choose a base branch
from
Open

Conversation

gsps
Copy link
Contributor

@gsps gsps commented Aug 9, 2020

No description provided.

Comment on lines +207 to +212
// NOTE(gsps): I believe this check is redundant except for the cases where we try to
// select class fields on expressions whose type is an abstract type def.
// We have a test case (extraction/valid/TypeMembers2.scala) in which we expand an
// abstract base class' method to a dispatcher, and such type defs appear. Simply omitting
// the effect by returning the empty set in such cases actually seems an odd choice to me,
// though I can't see how to exploit it for unsoundness.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rather sounds like a bug to me. I'm not sure I understand why we get the None case here, though. Shouldn't asClassType resolve the type def to a real class?

@romac: I think you wrote this code, why are we using an Option in rec? Shouldn't we just crash if we find a class selector on a non-class type (after abstract type resolution)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit complicated, I spent a couple of hours grokking all this yesterday. Basically, the abstract method in the test case gets expanded to something like

// Abstract method c on M
def c(thiss: M, t: thiss.T): Unit = {
  if (thiss.isInstanceOf[F]) c_(thiss.asInstanceOf[F], t)
  else ...
}

// Concrete method c on F
def c_(thiss: F, t_: thiss.T): Unit = { t_.x += 2 }

At the call site of c_ we try to relate some effects on t_ (the parameter of c_) to t (the argument being passed here). This involves taking the effect on t_.x from c_ and "rebasing" it on t. Since we aren't flow-sensitive, the type of t in c is still an abstract thiss.T, though, so the sanity check fails here.

Note that initially I thought none of this might be a problem and we could just call computeTargets anyways, but it turns out that we reject it later (with a MalformedStainlessCode exception, I believe). I forgot where exactly where it fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, at the call site in test do we see a call to c or c_? I would expect the Scala compiler to know that we're calling c_ here since the receiver type is concrete.

If it's a call to c_, then we shouldn't need any flow analysis to determine that thiss.T has a concrete override since thiss should have type F. What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't very clear here, I was just talking about the snippet I had posted. You can completely ignore test, the issue also occurs without it.

It's a call to c_, but a synthetic one that we introduced when expanding the abstract c to something that model virtual dispatch. So there's no chance for scalac to infer anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, I see.

I wonder if we should be injecting a cast on t in the synthetic dispatch then. Something like t.asInstanceOf[thiss.asInstanceOf[F].T]? I assume this would allow type parameter resolution.

Is the type check even valid without that cast?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we just inline the non-recursive calls until we see a t.asInstanceOf[thiss.asInstanceOf[F].T] receiver?

Actually, in which situations are we calling targetsOn? I would have thought that it wouldn't be invoked at all without the test method in the original benchmark. Which effect are we trying to resolve when it's absent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call it only in the situation mentioned in the second comment above (see the part of comment that goes "At the call site of c_ we try to ..."). There's no inlining happening there (this is in computeExprEffects rather than computeTargets) -- but perhaps you are suggesting to add inlining here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about this line?

If I understand the code correctly, we enter that case when we've found a concrete effect that occurs in the body of the callee (in this case, c_ has an effect on t_.x). We then try to resolve this effect on the provided argument (i.e. thiss).

It seems to me that if we find an effect for the function parameter, we should correctly encode all the information we need to resolve it on the provided argument. This might include a path condition and possibly some casts if necessary. I thought we had support for this anyway? Don't we need those to support effects on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we do (minus the casts): that's essentially Target. The existing code mostly projects these down to Effects, which drop the path condition, however. As far as I can tell, the current situation is that whenever we deal with a function invocation we take these as an approximation of the function's effects, i.e., we disregard the path conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think the rationale is that updating the aliased variable is fine even when the path condition doesn't hold (the update will just be a noop). This is probably necessary to deal with recursion. There is some support for inferring a minimal path condition while updating ADT fields when we might be given a different constructor instance: see mapApplication.

That code is fairly horrible, though... and I'm not sure it will translate well to type defs where the instance-of needs to be applied to a different argument (thiss instead of t).

@@ -336,6 +367,9 @@ trait EffectsAnalyzer extends oo.CachingPhase {
And(Not(cnd).setPos(cnd), e.setPos(cnd)).setPos(cnd)
} getOrElse(Not(cnd).setPos(cnd))

// FIXME: This seems wrong: why are we ignoring t.condition?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romac: bug?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should likely be conjoined with cnd.

Comment on lines +332 to +334
// This function provides an under-approximation in some cases; in particular, it might return
// an empty set for certain "non-local" constructs like function invocations. These seem to be
// handled specially elsewhere.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Regis' approach to aliasing, function invocations never return an aliased value (it is always fresh). We relaxed this to allow aliasing in non-recursive functions, but left is disabled in recursive ones.

I thought we used to check this constraint in the EffectsChecker, but I can't seem to find the corresponding code. @romac any idea where it lives?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you're referring to this? The recursive case will fall through to return Set.empty. It's a bit hard to reason about whether we always deal with the empty set correctly, since computeTargets (i.e. formerly getEffects) gets used in a few places and there are also the more and less strict versions computeExactTargets and computeKnownTargets. The strict version (which turns an empty set into an exception) is used with array updates, mutable map updates and field assignments in AntiAliasing. Does that cover what you had in mind?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is that an empty set means nothing will be mutated. Since we assume that function invocation results are always fresh (i.e. contain no aliased values), we can safely return an empty set of targets since mutating a function invocation result won't affect any state. I do agree that computeExactTargets and computeKnownTargets are quite confusing, it would be worth cleaning those up. I'm not sure we actually need them.

What I meant about the EffectsChecker constraint is that although we assume here that function results are always fresh, we don't seem to actually enforce this anywhere. I seem to recall we had this check at some point but I can't find it anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's a very useful remark.

Looking at a concrete case where the empty set case is differentiated from the non-empty case, I'm still a bit puzzled though. That particular piece of code translates Lets during AntiAliasing. If we know the exact effects of the let-binding's value, we can apparently conclude that that value won't change any further -- that is, we translate it as a Let and simply rewrite all occurrences of the binding by the (translated) value expression.

On the other hand, if we find the set of effects to be empty (or, implicitly, if an error occurred), we translate the binding as a LetVar and allow it to be mutated later on. That's somewhat surprising to me. What kind of mutation could be applied to that binding that would be detected by computeTargets, but wouldn't already have been encoded in the (translated) value expression of the binding?

(I realize I might be asking a very implementation-specific question here, so don't feel obliged to do any digging, if it's not obvious to you either.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is that an empty set means nothing will be mutated.
Without qualification, that's not really true, though, because function invocations might absolutely mutate their arguments, but we return the empty set for non-trivial (e.g. recursive) function invocations.

I think what's really meant by "effect" in this phase is "observable effects on the environment". That is,
a) we ignore effects on things that can't be referred to from the current scope (through some series of selections), and
b) we also ignore effects on things that are in scope, but have been consumed by some operations (chiefly, function invocations -- not sure if there are others). (Through the lens of a linear type system, these things arguably weren't in scope any longer, making this a special case of a).)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at a concrete case where the empty set case is differentiated from the non-empty case, I'm still a bit puzzled though.

For let-bindings with mutable type, we're dealing with the following two cases:

  1. We can compute the exact alias on the rhs of the let-binding. In that case, we can support this simple form of aliasing by replacing all occurrences of the bound variable with the set of computed targets.
  2. The rhs of the let-binding is fresh, which is (possibly wrongly) captured by getKnownEffects(lhs).isEmpty. In this case, we can replace the let-binding by a let-var and rewrite effects on parts of the binding to apply to the let-var itself (e.g. x.a.b = 1 will be rewritten to x = A(B(1))).

It seems to me that the case where we couldn't compute exact aliasing information and the rhs is not fresh isn't handled here. Maybe that case was ruled out in one of the previous checks? I agree it would make sense to improve the computeTargets function to have a clearer API. I'm not sure why ADT(...) is considered malformed (when the path head doesn't match some field) but not Array(...). Ideally, the computeTargets function would correctly capture the aliasing/freshness/unsupported state of its input.

I think what's really meant by "effect" in this phase is "observable effects on the environment".

Ah, I wasn't clear with what I meant by "nothing will be mutated". I meant that given some expression 'E[t]' where 'E[.]' applies some mutation on 't', if computeTargets(t) is empty, then we can ignore the result of 't' when computing global and local mutations. The observable effects should be equivalent to 'E[deep_copy(t)]'.
You can't ignore 'E[.]' or 't' when computing the actual effects, but you can ignore the effect of 'E[.]' on 't'.

(This might be the same thing you're saying :) getting a good mental image of effects is always a bit difficult.)

Copy link
Contributor Author

@gsps gsps Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry for the wall of text. Been appending to this since yesterday night.)

The rhs of the let-binding is fresh, which is (possibly wrongly) captured by getKnownEffects(lhs).isEmpty.

We also have isExpressionFresh in EffectChecker which seems to be another way of determining fresh-ness. Not sure if that one would be more appropriate.

I agree it would make sense to improve the computeTargets function to have a clearer API. (...)

I've been toying with some different representations for the result of computeTargets for the last two days. The way I now think about computeTargets(expr) is that it should compute a mapping effAt : Expr => Eff from path conditions to Eff, where Eff can be one of three things:

  • "no effect" (when there is in fact no effect under that path condition)
  • "unobservable effect" (when there is an effect, but it occurs on a fresh value)
  • "observable effect on receiver#path" (basically an instance of what is called Effect now).

The way it is actually implemented right now only keeps track of the latter (cumulatively represented as a Set[Target]), and conflates everything else into Set.empty (or raises an exception, if it encounters unsupported code).

I'm not completely sure whether it's worth explicitly representing the first two cases. But looking through the various cases where we combine targets, I noted that the behavior of computeTargets is slightly dubious:

  • For Ifs we seem to end up with the empty set, if either of the then or the else branch yields the empty set.
  • For Lets we end up with the empty set, if either of the binding's value or the body yields the empty set. For Lets, this immediately leads to an exception, though.

So, to summarize the current situation: Even without computeKnownTargets (which turns exceptions into the empty set), computeTargets might return the empty set as some sort of "unknown" sentinel value, and definitely does not just indicate "fresh" values.


I think the only reason why this doesn't break lots of tests is that we usually go through computeExprEffects, which actually handles these cases (e.g., Ifs) conservatively (taking the union of all effects). Also, in AntiAliasing, we only use targets in two other ways: computeExactTargets to find out which targets are affected by a field assignment, array-, or map-update. This usage is safe, since it actually treats empty sets as an error. computeKnownTargets is only used when rewriting Let bindings, as we discussed earlier. So the addendum to your above explanation is that in case 2.) the value might in fact not be fresh.

Speaking of computeExprEffects, there's also some strange behavior:

  • For Lets we seem to randomly pick an effect from the binding's value when extending the environment, because (effect(e, env): Set[Effect]).map(vd.toVariable -> _) will create key value pairs with the same key, so only the last one will survive.

Now based on these insights, I managed to find some new unsound examples:

  • In the first example, function f below, the behavior of computeTargets on If lets us create bindings that are ostensibly effect-free. We can incorrectly assert that f is pure (and AntiAliasing will not rewrite the function to return an updated b).
  • In the other two examples, functions g and h below, lead to crashes during ImperativeCodeElimination, because we try to Assign to bindings that are Lets, not LetVars. The underlying issue has to do with both the handling of Let in AntiAliasing and with the bug in computeExprEffects mentioned above, I believe.
import stainless.annotation.pure

case class Box(var length: Int)

object Foo {
  def makeBox(): Box = Box(0)

  // Passes and omits effect on b
  // (Notice that this function is in fact *not* pure!)
  @pure def f(b: Box): Unit = {
    val c =          // => Set.empty!
      if (true) b    // Set(Target(b, ...))
      else makeBox() // Set.empty
    c.length = 1     // Just affects c locally, and leaves b unchanged
  }

  def mutate(b: Box): Unit = { b.length = 123; }

  // Crash in ImperativeCodeElimination and "duplicates" effect on both b1 and b2
  def g(b1: Box, b2: Box): Unit = {
    val c = if (true) b1 else b2
    mutate(c)
    // The above call leads AntiAliasing to produce nonsensical conditions for mutating b1 and b2:
    // > var res: (Unit, Box) = mutate(if (true) b1 else b2)
    // > (if (res._2.isInstanceOf[Box]) {
    // >   b1 = @unchecked Box(@unchecked res._2.asInstanceOf[Box].length).asInstanceOf[Box]
    // > } else {
    // >   ()
    // > })
    // > (if (res._2.isInstanceOf[Box]) {
    // >   b2 = @unchecked Box(@unchecked res._2.asInstanceOf[Box].length).asInstanceOf[Box]
    // > } else {
    // >   ()
    // > })
  }

  // Crash in ImperativeCodeElimination and omits effect on b1/b2
  def h(b1: Box, b2: Box): Unit = {
    val c = if (true) b1 else b2
    val d = c // Interacts with Let handling `computeExprEffects`
    // We try to assign to val c (and nothing else!)
    mutate(d)
  }
}

@jad-hamza
Copy link
Contributor

I had forgotten about this PR, I guess it has many useful changes that could be included?

@vkuncak vkuncak added the aliasing Alias and effect analysis for imperative label May 20, 2021
@vkuncak vkuncak added the old Work that is substantially behind master label May 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aliasing Alias and effect analysis for imperative old Work that is substantially behind master
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants