Runtime Analysis replacing static Dataflow analysis #14633

hubertp · 2026-01-14T15:56:07Z

Pull Request Description

!DRAFT PR/WIP!

This is an experiment in replacing Dataflow analysis with runtime tracking.
(compared to other attempt) This change tracks/caches only results of variable assignments. The tracking of dependencies is only performed during instrumentation/interactive mode thus not affecting batching execution.

In a nutshell the problem can be simply observed with an example of

x = 4
y = x + 1
z = 10
a = z + y
a

where we want to collect runtime information of local variables

x depends on nothing
y depends on x
z depends on nothing
a depends on z and y

That way, if x changes, then only y and a get re-executed.

Ref is an object that wraps every instrumented RHS value of an AssignmentNode. As a result AssignmentNode sets Ref objects to the VirtualFrame slot rather than the value itself. Whenever instrumented ReadLocalVariableNode gets value from the frame it gets a Ref object, allowing to register a dependency between variables. The code above would roughly translate during instrumentation to

x = Ref(4)
y = Ref(x.unwrap + 1)
z = Ref(10)
a = Ref(z.unwrap + y.unwrap)
a.unwrap

The main logic that creates Ref objects and unwraps them when necessary is in onEnter/onReturnValue methods of IdExecutionInstrumentation.

Care had to be taken in order to properly track runtime dependencies between function calls, e.g.,

main =
    x = foo
    ...

foo =
    a = ....
    a

makes sure that x is a dependency on a.

As Ref is an object that is only manipulated during instrumentation/invalidation it should not be present in any of Truffle nodes.

For the correct instrumentation and handling of dependency tracking nodes (AssignmentNodes, ReadLocalVariableNode) we make sure that each has a unique identifier. As every expression has a (lazily-generated) UUID, we had to enhance it with additional information for the correct identification of instrumentable and cache-ready nodes.
Every expression node is assigned a RuntimeID that apart from the existing UUID informs if the UUID has been assigned externally (via GUI, ExternalUUID) or internally (via a randomly generated UUID, InternalUUID). Additionally RuntimeID carries information if the given expression should be cached, based on its position in the program tree representation.

Note that both ExternalUUID and InternalUUID may refer to a cached node but not all ExternalUUID are always cached (GUI assigns UUID to many expressions). This covers scenarios where many of our tests do not explicitly set UUID for all expressions but also situations where GUI might be missing those as well (uncommon but still possible). We want runtime tracking to continue to work in all such scenarios.

Only RHSs of local assignments are being cached (compared to additional self arguments in the existing solution).
When upserting visualizations, visualizations are being evaluated ASAP, rather than only whenever the corresponding expression was executed. Most of the time such evaluation should not require write lock on compilation, thus reducing the time to user feedback (TTUF). Additionally, whenever a cached value is already available, visualization is being executed with the cached value immediately, reducing the TTUF as well. If no cached value is present a fresh execution is scheduled.
Re-execution still executes from the beginning of a given program. Most of the time, whenever RHSs of assignments are cached, the execution of expressions is therefore immediate. Assignments need to be re-executed in order to initialize frames' slots.
Visualizations for subexpressions is problematic with the new approach (e.g. widgets) and common enough not to be ignored. The solution is for every visualization to always identify a parent expression. An additional map will carry information about such (subexpression) visualizations. If during execution of an assignment a RHS is not cached, a subexpression and all associated visualizations will be executed, as usual. However if RHS is cached but has subexpressions with not yet executed visualizations, it is still fully executed without updating or invalidating RHS or its dependencies. This diverges from to the current Dataflow analysis approach which would often trigger excessive cache invalidations and re-executions on new visualization requests.
Care had to be taken in order to execute only 1(!) ExecuteJob at a time. If parallel execution is enabled, Truffle instrumentation appears to execute (Probe-) Nodes sometimes in an out-of-order fashion, making the logic that tracks dependencies unpredictable.
ChangesetBuilder had to be adapted in order to correctly identify RHSs that need to be invalidated on edits. Similarly IdMaps had to be taken into account, as revealed by failing tests.
Currently every context has its own RuntimeCache. This slightly complicates the dependency tracking between function calls as instrumentation has to be able to still make a dependency between return value of a function call and caller.
RefInvalidation encapsulates a logic for invalidating cached RHSs. It takes a special care to invalidate caches between contexts (every function call has its own runtime cache). This is related to the fact that care had to be taken to only send visualizations/expression updates whenever we are in the right context, as revealed by some tests in RuntimeVisualizationsTest.
Mostly tested with real/big projects and unit testing in RuntimeVisualizationsTest/RuntimeServerTest.

There are still failing tests/broken logic so this is very much WIP.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.
If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
or the Snowflake database integration, a run of the Extra Tests has been scheduled.
- If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

Distinguishing UUID into External and Internal to allow to better control (required) dependencies.

Tracking runtime dependencies no longer affects regular execution*. All tracking is done in instrumentation by carefully wrapping and unwrapping Ref objects at specific nodes. *writing/reading from a frame slot is limited to objects and not optimized for longs

Rather than associating visualizations with either RuntimeCache or Ref entries, fallback to the old logic using synchronization. This is due to the fact on how instrumentation collects node information when entering some non-top context. It was impossible to track when to send visualization updates when popping context without a separate synchronization map which now finally made sense. 5 tests are failing, mostly due to subexpression invalidation (or rather lack of it). This will be introduced in the next PR. RuntimeAnalysis should be somehow accessible via state but that is currently problematic. This will be resolved eventually.

1. Updating IdMap in a subexpression now invalidates the whoe RHS expression. 2. RHS with an explicit type ascription will NOT be cached because a) the ascribed expression shares the same external ID with type ascription b) (more importantly) `TypeCheckExpressionNode` is currently treated specially as an `ExpressionNode` and is NOT instrumentable. Maybe we can mitigate those 2 limitations but for now, let's leave it as is.

All logic for building ref dependencies is now encapsulated in callbacks implementation. Limit exposure to invalid runtime dependency construction by limiting to 1 ExecuteJob.

1. Subexpressions execution Correctly detect when a parent expression needed to be updated and therefore invalidated, when re-executing subexpressions. That way we avoid spurious updates to expressions/visualizations 2. ChangesetBuilder would detect changes too precisely. ChangesetBuilder would identify the subexpression update (like applying an argument to partially applied function) but not the RHS of an expression. This led to stale computations. Now all changes are mapped to RHS, if possible.

We are now more precise

Many of tests require proper runtime analysis despite missing identifiers for RHSs. GUI could also be missing explicit IDs (by accident or on purpose) and runtime analysis should continue to work in such cases. Added an unfortunate hack to `ForceNode` when a Ref is not being unwrapped. A proper fix is still TBD.

Avoiding unintentional double caching.

Some of the persisted objects were equal despite reporting the opposite.

JaroslavTulach · 2026-01-28T14:34:29Z

Running sbt runtime-integration-tests/test locally shows following failures:
- RuntimeErrorsTest
  - should send updates when dataflow error changes in method
  - should send updates when dataflow error is resolved in method
- RuntimeVisualizationsTest
  - should run visualization error preprocessor
- RuntimeRecomputeTest
  - should recompute expressions invalidating all
  - should recompute expressions changing an execution environment
  - should recompute expression with expression configs
  - should recompute expression with expression configs in one-off execution
  - should recompute recursive method call with expression configs
  - should recompute method call returning Panic with expression configs
- RuntimeServerTest
  - should send updates when function body is changed
  - should obey the execute parameter of edit command
  - should send updates when the type is changed
  - should send updates when the method pointer is changed
  - should skip side effects when evaluating cached expression
  - should rename a project
  - should edit two local calls with cached self
- RuntimeRefactoringTest
  - should rename operator in main body
  - should edit file after renaming local
  - should rename qualified module method in main body
  - should rename to original name qualified module method in main body
  - should rename qualified module method in lambda expression
  - should rename unqualified module method in main body
  - should edit file after renaming module method
  - should rename operator if the same definition exists in different method body
- and more...
shouldn't we resolve the merge conflicts and let CI report the failures "objectively"?

JaroslavTulach · 2026-01-28T15:34:04Z

shouldn't we resolve the merge conflicts and let CI report the failures "objectively"?

b4b8d27 is my attempt to merge
it can be removed, if incorrect - I am not sure about all the conflicts I resolved
with that commit I am getting:

sbt runtime-integration-tests/test
[error] Failed: Total 27356, Failed 87, Errors 0, Passed 27269, Ignored 29, Pending 30
[error] Failed tests:
[error]         org.enso.interpreter.test.instrument.RuntimeErrorsTest
[error]         org.enso.interpreter.test.instrument.RuntimeVisualizationsTest
[error]         org.enso.interpreter.test.instrument.RuntimeProgressTest
[error]         org.enso.interpreter.test.instrument.RuntimeRecomputeTest
[error]         org.enso.interpreter.test.instrument.WarningInstrumentationTest
[error]         org.enso.interpreter.test.instrument.RuntimeServerTest
[error]         org.enso.interpreter.test.instrument.IncrementalUpdatesTest
[error]         org.enso.interpreter.test.instrument.RuntimeRefactoringTest
[error]         org.enso.interpreter.test.instrument.RuntimeAsyncCommandsTest
[error]         org.enso.interpreter.test.instrument.AvoidIdInstrumentationTagTest
[error]         org.enso.interpreter.test.semantic.ExpressionIdTest
[error]         org.enso.interpreter.test.semantic.MethodsTest

hubertp added 13 commits November 20, 2025 15:25

Porting External/InternalUUID

00a40b5

Distinguishing UUID into External and Internal to allow to better control (required) dependencies.

WIP: explicit blocks for tracking in Assignment/ReadLocal-Node

ad2ad28

Added support for subexpressions

a99e62f

Merge branch 'develop' into wip/hubert/10525-visualizations

dd60db8

Merge branch 'develop' into wip/hubert/10525-visualizations

48ff098

wip: encapsulate loginc in callbacks

7aa3ae6

All logic for building ref dependencies is now encapsulated in callbacks implementation. Limit exposure to invalid runtime dependency construction by limiting to 1 ExecuteJob.

Minor update re expression updates

6fc1242

We are now more precise

Merge branch 'develop' into wip/hubert/10525-visualizations

f8f308d

hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label Jan 14, 2026

hubertp mentioned this pull request Jan 14, 2026

WIP: Replacing Static Dataflow Analysis with Reactive Caches #13948

Closed

5 tasks

nits

1257d34

hubertp added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Jan 14, 2026

hubertp added 4 commits January 15, 2026 12:18

Merge branch 'develop' into wip/hubert/10525-visualizations

ac96baa

Cleanups

125d657

Avoiding unintentional double caching.

obsolete

50e7e1b

Fix legit problems with persisted caches

4464351

Some of the persisted objects were equal despite reporting the opposite.

enso-bot bot mentioned this pull request Jan 22, 2026

Storing LocalDate in off-heap IntBuffer #14652

Merged

2 tasks

Merging with most recent develop

b4b8d27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime Analysis replacing static Dataflow analysis #14633

Runtime Analysis replacing static Dataflow analysis #14633

hubertp commented Jan 14, 2026 •

edited

Loading

Uh oh!

JaroslavTulach commented Jan 28, 2026 •

edited

Loading

Uh oh!

JaroslavTulach commented Jan 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Runtime Analysis replacing static Dataflow analysis #14633

Are you sure you want to change the base?

Runtime Analysis replacing static Dataflow analysis #14633

Conversation

hubertp commented Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Description

Checklist

Uh oh!

JaroslavTulach commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JaroslavTulach commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hubertp commented Jan 14, 2026 •

edited

Loading

JaroslavTulach commented Jan 28, 2026 •

edited

Loading

JaroslavTulach commented Jan 28, 2026 •

edited

Loading