Define inlined, more-dumb versions of Mod_bounds ops #3605

glittershark · 2025-02-19T19:54:10Z

The mod-bounds ops that used polymorphic functions and first-class modules were
not only a little bit over-abstracted for some people's taste, they actually had
noticeably worse performance, due to suboptimal inlining behavior (and
allocation!) of the Accent_lattice FCM. This changes them to be dumber and
more repetitive, but more direct, which also gets us a few percent performance
win.
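
To make the trade-off concrete, here is a minimal sketch of the two styles. It is not the code from typing/jkind.ml: the two-axis `bounds` record, the `Lattice` signature, `join_fn`, and `map2` are all hypothetical stand-ins, and the real Mod_bounds has many more axes. The generic version routes every axis through one polymorphic function that takes the lattice as a first-class module; the direct version repeats itself per field.

```ocaml
(* Hypothetical lattice signature and two toy axes. *)
module type Lattice = sig
  type t
  val join : t -> t -> t
end

module Locality = struct
  type t = Global | Local
  let join a b = match a, b with Global, Global -> Global | _ -> Local
end

module Nullability = struct
  type t = Non_null | Maybe_null
  let join a b =
    match a, b with Non_null, Non_null -> Non_null | _ -> Maybe_null
end

type bounds = { locality : Locality.t; nullability : Nullability.t }

(* Abstract style: one polymorphic function applied to every axis, with the
   lattice operations passed as a first-class module. *)
type join_fn = { f : 'a. (module Lattice with type t = 'a) -> 'a -> 'a -> 'a }

let map2 (fn : join_fn) b1 b2 =
  { locality =
      fn.f (module Locality : Lattice with type t = Locality.t)
        b1.locality b2.locality;
    nullability =
      fn.f (module Nullability : Lattice with type t = Nullability.t)
        b1.nullability b2.nullability }

let join_generic b1 b2 =
  let f (type a) (module L : Lattice with type t = a) (x : a) (y : a) : a =
    L.join x y
  in
  map2 { f } b1 b2

(* Direct style: more repetitive, but every call is a known function. *)
let join_direct b1 b2 =
  { locality = Locality.join b1.locality b2.locality;
    nullability = Nullability.join b1.nullability b2.nullability }
```

Both functions compute the same result. Whether the compiler manages to inline through the packed module in the generic version depends on the optimizer and flags in use; the description above reports that it did not do so well enough here.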

liam923

As we discussed in person, I disagree with the thesis that this code is "better". I think it is harder to read and maintain. As of now, I think it is also harder to reason about due to some of the axes being backwards (but I think we can fix this). However, it is clearly more performant, and these are hot paths. So I think this is worth it.

typing/jkind.ml

goldfirere

I don't really love the hybrid approach in this PR: keeping Mod_bounds as an Axis_collection, but then just inlining a few function calls. Would we get these same performance gains with [@inline]? Then we should do that. Or if the goal is to remove the Axis_collection abstraction, then go all the way and do it -- possibly recouping even more perf gains.

typing/jkind.ml

goldfirere · 2025-02-19T21:19:45Z

@liam923

As we discussed in person, I disagree with the thesis that this code is "better".

I had a long conversation with @lpw25 about this last week -- because he thinks this new simplified code is better, and I disagreed. I continue to find the old more abstract code easier to reason about than the new code, because I find the abstraction of Axis_collection to be conceptually straightforward, even with its parameterization with Indexed. I can learn that abstraction once, and then each use of a function in it is easy to understand in context.

But: this abstraction has been evolving. I know because I spent a bunch of time adding capabilities to it (e.g. Map2 and Fold). And I believe you recently added Indexed and the mono and poly variants. Folks who come across it for the first time have to spend a not-quite-trivial amount of time understanding what's going on and tracing through the meanings of the various pieces. On the other hand, if we had done the simple stuff to begin with, I think the net time spent would be significantly lower, even accounting for the several occasions on which we would have spent several minutes routinely updating the affected fields. (Tuning the abstraction is much more fun than repetitively editing field names. We're not optimizing for fun, sadly.)

Until Leo made the point to me about this time cost, I pushed back about which code was better. I continue to think that, if this code were more at rest, Axis_collection would make it better. I think some of this just comes down to personal taste: at one point I used Option.bind because that just makes more sense to me than a direct match, but Leo found the code more confusing. Perhaps if the majority of the team prefers the simpler code, that really does make it better. Either way, in this case I have to agree that going on without Axis_collection is the right call, just because of the time cost of its maintenance.

(And then, of course, there's the performance aspect!)

glittershark · 2025-02-25T18:50:27Z

I've pushed a (trivial) rebase followed by another commit, 90881e1, which gets rid of Axis_collection entirely, at @goldfirere's suggestion. perf seems basically equivalent. @goldfirere and @liam923 PTAL

glittershark · 2025-02-25T18:53:34Z

and another rebase, to fix conflicts; the commit to review is now 4002964

typing/jkind.ml

goldfirere · 2025-02-25T19:16:22Z

typing/jkind.ml

+    create ~locality ~linearity ~uniqueness ~portability ~contention ~yielding
+      ~externality ~nullability
+
+  let less_or_equal t1 t2 =


I think I'd rather the annoying pattern-match here. Nothing stops this from accidentally forgetting an axis otherwise.

Hmm... I now see why this is hard. I don't have a great idea here. Maybe the best is to point here from the definition of Mod_bounds saying that any addition needs to be reflected? Ditto equal.

I realized I forgot to mention this in the commit message, but the choice to make these records abstract and only expose getters and setters was intentional, to pave the road for having the more efficient representation (bit fields) next.

I'm also concerned about forgetting new axes here. Here's a not-fun suggestion:

val match_axes : t -> (locality:Locality.t -> linearity:Linearity.t -> ... -> 'a) -> 'a

Hopefully this would get inlined the way I'd hope.

I think the eliminator approach is probably the nicest choice here, if it does in fact get inlined - but I am wary of doing it until after we do the bitfield refactor, because it's only then that we know if it's introduced noticeable pessimization. Do you mind if we wait until after that change to do this?

Fine by me, yes
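
For reference, a minimal sketch of what the eliminator suggested above could look like, cut down to two axes. The module layout, the toy `Locality` and `Linearity` types, and the `equal` caller are assumptions made for illustration; only the name and rough shape of `match_axes` come from the thread. The record stays abstract, so the representation can still change later, but every caller has to name every axis in its callback.

```ocaml
module Locality = struct type t = Global | Local end
module Linearity = struct type t = Many | Once end

module Mod_bounds : sig
  type t  (* abstract, so the representation can change later *)

  val create : locality:Locality.t -> linearity:Linearity.t -> t

  (* Eliminator: hands every axis to the continuation at once. *)
  val match_axes :
    t -> (locality:Locality.t -> linearity:Linearity.t -> 'a) -> 'a
end = struct
  type t = { locality : Locality.t; linearity : Linearity.t }

  let create ~locality ~linearity = { locality; linearity }

  let[@inline] match_axes { locality; linearity }
      (k : locality:Locality.t -> linearity:Linearity.t -> _) =
    k ~locality ~linearity
end

(* A caller outside the module: it cannot see the fields, but it also
   cannot skip one, because the callback must accept every label. *)
let equal t1 t2 =
  Mod_bounds.match_axes t1 (fun ~locality:loc1 ~linearity:lin1 ->
      Mod_bounds.match_axes t2 (fun ~locality:loc2 ~linearity:lin2 ->
          loc1 = loc2 && lin1 = lin2))
```

Adding a third axis to the type of `match_axes` would make `equal` fail to type-check until its callbacks are updated, which is the property being asked for. Whether the closures get inlined away is exactly the open question in the thread, so this sketch makes no performance claim.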

goldfirere · 2025-02-25T19:17:03Z

typing/jkind.ml

+    @@ axis_less_or_equal ~le:Nullability.le ~axis:(Pack (Nonmodal Nullability))
+         (nullability t1) (nullability t2)
+
+  let equal t1 t2 =


Another good place for a pattern-match

see also #3605 (comment)
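
A rough illustration of why a pattern match guards against a forgotten axis, assuming the record fields are visible at the definition site (which this PR deliberately avoids by keeping the type abstract). The three `int` fields are placeholders, not the real axis types. With warning 9 enabled, a record pattern that omits a field is flagged, so adding a field to `t` makes the compiler point at `equal`.

```ocaml
(* Warning 9: the labels listed in a record pattern are not exhaustive. *)
[@@@warning "+9"]

(* Placeholder record; the int fields stand in for real axis types. *)
type t = { locality : int; linearity : int; nullability : int }

(* Every field is named explicitly and there is no trailing `; _`, so a
   newly added field triggers warning 9 here until it is handled. *)
let equal
    { locality = loc1; linearity = lin1; nullability = nul1 }
    { locality = loc2; linearity = lin2; nullability = nul2 } =
  Int.equal loc1 loc2 && Int.equal lin1 lin2 && Int.equal nul1 nul2
```

In a build that treats warnings as errors this turns a silent omission into a compile failure. Getter-based code such as `nullability t1` gives no such signal, which is the concern raised above.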

goldfirere · 2025-02-25T19:31:15Z

typing/jkind.ml

+            let value_for_axis (type a) ~(axis : a Axis.t) : a =
+              if Axis_set.mem relevant_axes axis
+              then
+                let (module Bound_ops) = Axis.get axis in


Can we do better here? I thought we identified this as an allocate-y operation which might be slow. And it's in a loop here. It's a bit more verbose, but https://github.com/goldfirere/flambda-backend/blob/rae/dismantle-axis-collection/typing/jkind.ml#L1654 has another way.

We could also make value_for_axis take the join operation as an argument.

this doesn't end up showing up on a profile, fwiw, for whatever reason. We could make it worse for consistency's sake, but I'm hesitant to?

My profile shows a decent amount of time in Accent_lattice. I think I will try to improve this.
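
A small sketch of the two options discussed in this thread, using made-up names throughout (`axis`, `get`, `join_via_fcm`, `join_with`, and the toy lattices are all hypothetical, not the definitions in typing/jkind.ml). The first version looks the operations up through a first-class module on each call, similar in shape to `Axis.get axis`; the second takes the join operation as an argument, as suggested above.

```ocaml
module type Lattice = sig
  type t
  val min : t
  val join : t -> t -> t
end

module Locality = struct
  type t = Global | Local
  let min = Global
  let join a b = if a = Local || b = Local then Local else Global
end

module Nullability = struct
  type t = Non_null | Maybe_null
  let min = Non_null
  let join a b =
    if a = Maybe_null || b = Maybe_null then Maybe_null else Non_null
end

(* A GADT naming each axis by the type of its values. *)
type _ axis =
  | Locality_axis : Locality.t axis
  | Nullability_axis : Nullability.t axis

(* Lookup through a first-class module. *)
let get : type a. a axis -> (module Lattice with type t = a) = function
  | Locality_axis -> (module Locality)
  | Nullability_axis -> (module Nullability)

let join_via_fcm (type a) (axis : a axis) (xs : a list) : a =
  let (module L) = get axis in
  List.fold_left L.join L.min xs

(* Alternative: the caller passes the operations it already knows. *)
let join_with ~join ~min xs = List.fold_left join min xs
```

Call sites of the second form look like `join_with ~join:Locality.join ~min:Locality.min xs`: more verbose, but no module is packed or unpacked on the way. Whether the first form actually allocates or shows up on a profile depends on the compiler and workload, and the two profiles mentioned in this thread disagree on that.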

liam923 · 2025-02-25T20:04:57Z

typing/jkind.ml

-                  A.meet base_modifier parsed_modifier.txt)
-          }
+        let value_for_axis (type a) ~(axis : a Axis.t) : a =
+          let (module A) = Axis.get axis in


This seems analogous to the situation in normalize

analogously, this doesn't really show up on a profile, so I'm hesitant to make it more verbose without good reason.

glittershark requested a review from liam923 February 19, 2025 19:54

liam923 reviewed Feb 19, 2025

glittershark force-pushed the aspsmith/with-kind-perf branch from 4e73b25 to 8536286 Compare February 19, 2025 20:14

glittershark force-pushed the aspsmith/dumber-mod-bounds-ops branch from 8cfbcc5 to ee44e52 Compare February 19, 2025 20:17

goldfirere reviewed Feb 19, 2025

glittershark force-pushed the aspsmith/with-kind-perf branch from 8536286 to a0af476 Compare February 20, 2025 16:26

Base automatically changed from aspsmith/with-kind-perf to main February 20, 2025 20:24

glittershark force-pushed the aspsmith/dumber-mod-bounds-ops branch 2 times, most recently from 7aac3d7 to 90881e1 Compare February 25, 2025 18:45

glittershark added 2 commits February 25, 2025 13:50

Delete Axis_collection entirely

4002964

Let's inhabit one side of this architectural fence entirely, rather than straddling it.

glittershark force-pushed the aspsmith/dumber-mod-bounds-ops branch from 90881e1 to 4002964 Compare February 25, 2025 18:53

goldfirere reviewed Feb 25, 2025

some more inlining

9258c78

goldfirere mentioned this pull request Feb 25, 2025

Strip jkind_axis functions out of modules #3615

Closed

liam923 reviewed Feb 25, 2025

Use equal rather than less_or_equal to check max

177932d

This is actually much faster, and has the same effect for lattice reasons
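
The "lattice reasons" can be spelled out with a toy example. In a lattice with a top element, every `x` satisfies `le x max`, so `le max x` can only hold when `x` is `max` itself; checking `equal max x` therefore gives the same answer. The module below is a hypothetical two-point lattice, not the real axis code.

```ocaml
module Locality = struct
  type t = Global | Local

  let max = Local

  (* Global <= Local; everything is <= itself. *)
  let le a b = match a, b with Local, Global -> false | _ -> true

  let equal (a : t) (b : t) = a = b
end

(* Two ways to ask "is x the top element?". *)
let is_max_via_le x = Locality.le Locality.max x
let is_max_via_equal x = Locality.equal Locality.max x
```

On this toy type the two checks agree for every value. The speed difference reported in the commit message would come from the real `equal` being cheaper than the real `less_or_equal`, which this sketch does not measure.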
