fix: EXC-1992: Fix same round install code and draining queues #4346

berestovskyy · 2025-03-12T16:19:38Z

The current scheduler logic permits advance_long_running_install_code to complete a lengthy install process and then immediately initiate a new one within drain_subnet_queues.

This PR uses the can_execute_subnet_msg function to differentiate between heavy subnet messages, which must respect the round limit, and light subnet messages, which can be executed at any time.

Fixes RUN-386

The current scheduler logic permits `advance_long_running_install_code` to complete a lengthy install process and then immediately initiate a new one within `drain_subnet_queues`. This PR uses the `can_execute_subnet_msg` function to differentiate between heavy subnet messages, which must respect the round limit, and light subnet messages, which can be executed at any time. Fixes RUN-386

dsarlis

Approving since the issue is fixed. Both comments are non-blocking and can be followed up on later if there's agreement.

dsarlis · 2025-03-13T15:45:16Z

rs/execution_environment/src/scheduler.rs

    match method {
        // Only one install code message allowed at a time.
        Ic00Method::InstallCode | Ic00Method::InstallChunkedCode => {
-            !ongoing_long_install_code && !effective_canister_is_aborted
+            !instructions_reached && !ongoing_long_install_code && !effective_canister_is_aborted


This feels a bit brittle to me. Let's imagine that we add a new method, is it clear enough that someone would think about whether they need to make it a special case or they can just put it in the catch all match below?

We have the ic00_permissions which seems like a potentially promising approach to encode some of these requirements better. For example by adding a "heavy_operation" (or consuming instructions flag), maybe also one about "effective_canister_is_aborted". Probably that would require a rename to the module as these are not really permissions then. But I see a benefit in encoding in one place these requirements which would also mean that we make it easier for people to add all restrictions related to management calls (whether it's who can call them, or if they consumed instructions).

Given that this would likely be a bigger refactoring and assuming we see value, we can follow up later, I do not consider it a blocker.

dsarlis · 2025-03-13T15:48:22Z

rs/execution_environment/tests/dts.rs

+
+#[test]
+#[cfg(not(all(target_arch = "aarch64", target_vendor = "apple")))]
+fn yield_for_dirty_pages_copy_works_for_install_code() {


IIUC, the problem would manifest even in case of long install code due to many instructions used, followed up by another install code immediately. Should the tests be a bit more general to cover that instead of focusing on the case of too many dirty pages touched? Is my understanding wrong?

github-actions bot added the fix label Mar 12, 2025

berestovskyy force-pushed the andriy/exc-1992-fix-same-round-install-and-drain branch from d8ca6a8 to f8b75cc Compare March 12, 2025 19:23

berestovskyy changed the title ~~fix: EXC-1992: Fix same round advancing install code and draining queues~~ fix: EXC-1992: Fix same round install code and draining queues Mar 12, 2025

berestovskyy force-pushed the andriy/exc-1992-fix-same-round-install-and-drain branch 2 times, most recently from 4035f66 to ad85db5 Compare March 12, 2025 22:08

berestovskyy force-pushed the andriy/exc-1992-fix-same-round-install-and-drain branch from ad85db5 to b1a77b0 Compare March 13, 2025 05:17

berestovskyy marked this pull request as ready for review March 13, 2025 09:06

berestovskyy requested a review from a team as a code owner March 13, 2025 09:06

github-actions bot added the @execution label Mar 13, 2025

dsarlis approved these changes Mar 13, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: EXC-1992: Fix same round install code and draining queues #4346

fix: EXC-1992: Fix same round install code and draining queues #4346

berestovskyy commented Mar 12, 2025 •

edited

Loading

dsarlis left a comment

dsarlis Mar 13, 2025

dsarlis Mar 13, 2025

fix: EXC-1992: Fix same round install code and draining queues #4346

Are you sure you want to change the base?

fix: EXC-1992: Fix same round install code and draining queues #4346

Conversation

berestovskyy commented Mar 12, 2025 • edited Loading

dsarlis left a comment

Choose a reason for hiding this comment

dsarlis Mar 13, 2025

Choose a reason for hiding this comment

dsarlis Mar 13, 2025

Choose a reason for hiding this comment

berestovskyy commented Mar 12, 2025 •

edited

Loading