py/vm: Add branch-point VM switching for dual-VM settrace. #23
andrewleech wants to merge 3 commits into review/py-settrace-dual-vm
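The webassembly CI was failing because commit f6f572b ("webassembly/api: Fix CLI.") changed the stdin check from `process.stdin.isTTY === false` to `!process.stdin.isTTY`, which also matches `undefined` in non-TTY CI environments. As a result `runCLI()` blocks on `fs.readFileSync(0)` even when file arguments are provided. Fixed by adding `&& repl` so that stdin is only read when no file arguments were given and the REPL would otherwise start.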
Current-frame tracing: CPython compatibility and enhancement options

With dual-VM, [...]

If we want to go beyond CPython's default in the future, there are a few options:

Option A - Auto-propagate frames on settrace: Walk the [...]

Option B - VM switch at branch points: Piggyback on the standard VM's [...]

Option C - Expose [...]

Option D - No changes (current recommendation): Current behaviour is CPython-compatible. Revisit if a concrete debugger use case requires it.

These are documented in [...]
Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
When sys.settrace(callback) is called while a function is already running in the standard VM, switch to the tracing VM at the next branch instruction rather than waiting for the next function call.

Adds MP_VM_RETURN_SWITCH_VM to the return kind enum. The dispatcher in vm_outer.c loops on this value to re-select the VM. The switch triggers at pending_exception_check in vm.c (after jumps, conditionals, for-iter).

The check placement is asymmetric for zero hot-path overhead in the standard VM: settrace() bumps sched_state to MP_SCHED_PENDING, which triggers the existing slow-path block where the prof_trace_callback check lives. The tracing VM has the reverse check on its hot path, which is acceptable since tracing already has per-instruction overhead.

A vm_switch_pending flag in mp_state_vm_t keeps sched_state at PENDING (via mp_sched_unlock) until the standard VM actually switches, preventing nested function calls from consuming the signal.

The dispatcher also checks prof_callback_is_executing to avoid selecting the standard VM while a trace callback is executing, which would hit an assert in FRAME_ENTER.

With GIL threading, the calling thread holds the GIL from settrace() through to pending_exception_check, so it always processes its own signal. On non-GIL threaded ports, another thread may consume the sched_state signal first, degrading to function-boundary switching.

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
Tests mid-function VM switching: enable/disable mid-loop, nested calls after branch, toggle on/off/on, exception handling, return value integrity, generator mid-iteration switch, and self-disabling trace callback.

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
Summary
With the dual-VM settrace in micropython#18571, calling `sys.settrace()` mid-function only takes effect at the next function call boundary. This means a long-running loop that doesn't call other functions won't start tracing until the loop exits. This PR makes settrace take effect at the next branch instruction (loop iteration or conditional jump) within the currently-executing function.

Built on top of micropython#18571 as a suggestion - I may have misunderstood the exact limitation you described in our conversation, but this seemed worth exploring.
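The scenario can be sketched in a few lines of Python. This is an illustrative script, not one of the PR's tests: `tracer`, `helper` and `busy_loop` are hypothetical names, and it only records `call` events for `helper`, which is what host CPython reports when tracing is toggled inside a running function. It does not show what MicroPython emits for the loop body itself after a VM switch.

```python
import sys

events = []


def tracer(frame, event, arg):
    # Record only 'call' events for the helper. Returning None skips
    # per-line tracing inside the called frame.
    if event == "call" and frame.f_code.co_name == "helper":
        events.append(frame.f_locals["i"])
    return None


def helper(i):
    return i * 2


def busy_loop():
    total = 0
    for i in range(6):
        if i == 2:
            sys.settrace(tracer)  # enabled while busy_loop is already running
        if i == 4:
            sys.settrace(None)    # disabled again, still inside the same call
        total += helper(i)
    return total


previous = sys.gettrace()
try:
    result = busy_loop()
finally:
    sys.settrace(previous)  # leave the interpreter as we found it

print(result, events)  # CPython prints: 30 [2, 3]
```

If the `helper(i)` call were inlined so that the loop made no function calls at all, function-boundary switching would not start tracing until the loop exited, which is the case this PR targets.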
The standard VM has zero additional overhead from this change. `settrace()` bumps `sched_state` to `MP_SCHED_PENDING`, and the existing slow-path check (`if (sched_state == PENDING || ...)`) picks it up at the next branch instruction. No new reads on the fast path. The tracing VM checks `prof_trace_callback` directly on its hot path, which is acceptable since tracing already has per-instruction overhead.

A `vm_switch_pending` flag in `mp_state_vm_t` prevents nested function calls between `settrace()` and the caller's next branch point from consuming the `sched_state` signal via `mp_sched_unlock`. Without this, a function call made before the branch point clears PENDING and the switch never happens.

The dispatcher in `vm_outer.c` also checks `prof_callback_is_executing` before selecting the standard VM - this fixes an assert that fires when `settrace(None)` is called mid-function while the trace callback itself is executing.

The webassembly/api.js commit is a separate fix for CI breakage in the base branch. Commit f6f572b ("webassembly/api: Fix CLI") changed the stdin guard from `process.stdin.isTTY === false` to `!process.stdin.isTTY`, which made it also match `undefined` (non-TTY CI environments). This causes `runCLI()` to block on `fs.readFileSync(0)` even when file arguments are provided. The fix adds `&& repl` so stdin is only read when no file args were given and the REPL would otherwise start.

For threaded builds with the GIL, the calling thread holds the GIL from `settrace()` through to the pending exception check, so it always processes its own signal. On non-GIL threaded builds there's a theoretical race where another thread could consume `sched_state` first, degrading to function-boundary switching.
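The interaction between the two VMs, the dispatcher loop and the `vm_switch_pending` flag can be illustrated with a toy model. This is a Python sketch written for this description, not the C code in `vm.c` or `vm_outer.c`: the `State` and `CodeState` classes, the string opcodes and the exact points where flags are cleared are all assumptions, and only the field names mirror the ones mentioned above.

```python
from dataclasses import dataclass, field

RETURN_NORMAL = "normal"
RETURN_SWITCH_VM = "switch_vm"  # stands in for MP_VM_RETURN_SWITCH_VM


@dataclass
class State:
    prof_trace_callback: object = None
    prof_callback_is_executing: bool = False
    sched_pending: bool = False      # models sched_state == MP_SCHED_PENDING
    vm_switch_pending: bool = False
    log: list = field(default_factory=list)


@dataclass
class CodeState:
    program: list
    ip: int = 0                      # shared, so it survives a VM switch


def settrace(state, callback):
    state.prof_trace_callback = callback
    if callback is not None:
        # Assumption: enabling raises the scheduler signal and pins it.
        state.sched_pending = True
        state.vm_switch_pending = True


def sched_unlock(state):
    # A nested call would normally clear PENDING; the flag keeps it set.
    if not state.vm_switch_pending:
        state.sched_pending = False


def step(state, code, vm_name):
    op = code.program[code.ip]
    code.ip += 1
    state.log.append((vm_name, op))
    if op == "settrace_on":
        settrace(state, lambda *args: None)
    elif op == "settrace_off":
        settrace(state, None)
    elif op == "call":
        sched_unlock(state)          # hypothetical nested function call
    return op


def standard_vm(state, code):
    while code.ip < len(code.program):
        op = step(state, code, "standard")
        # Slow path only: reached at a branch when the signal is raised.
        if op == "branch" and state.sched_pending:
            state.sched_pending = False
            if state.vm_switch_pending:
                state.vm_switch_pending = False
                return RETURN_SWITCH_VM
    return RETURN_NORMAL


def tracing_vm(state, code):
    while code.ip < len(code.program):
        op = step(state, code, "tracing")
        # Reverse check, made directly by the tracing VM.
        if (op == "branch" and state.prof_trace_callback is None
                and not state.prof_callback_is_executing):
            return RETURN_SWITCH_VM
    return RETURN_NORMAL


def execute(state, code):
    # Dispatcher: re-select the VM each time one asks for a switch.
    while True:
        tracing = (state.prof_trace_callback is not None
                   or state.prof_callback_is_executing)
        vm = tracing_vm if tracing else standard_vm
        if vm(state, code) != RETURN_SWITCH_VM:
            return RETURN_NORMAL


state = State()
code = CodeState(["op", "settrace_on", "call", "branch",
                  "op", "settrace_off", "branch", "op"])
execute(state, code)
print([vm for vm, _ in state.log])
```

In this model the first four steps run in the standard VM, the next three in the tracing VM, and the last one back in the standard VM. Dropping the `vm_switch_pending` test from `sched_unlock` lets the `call` step clear the signal, and the model then stays in the standard VM for the whole program.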
Size cost on unix/coverage x86-64: +816 bytes .text, +128 bytes .bss.
Testing
8 test cases in `tests/misc/sys_settrace_midfunction.py` covering: enable/disable mid-loop, nested calls after branch, toggle on/off/on, exception handling, return value integrity, generator mid-iteration, and self-disabling trace callback. Tested on unix/coverage only.

Note: the existing `sys_settrace_features.py`, `sys_settrace_generator.py`, `sys_settrace_loop.py` and `sys_settrace_cov.py` tests fail on the base dual-VM branch (before these commits). The cause is `<listcomp>` frame tracking: the dual-VM's standard VM copy maintains the `current_code_state` frame chain, so listcomps appear as separate call/return frames in trace output. This is actually correct for MicroPython's execution model (listcomps are separate bytecode functions) and matches CPython 3.11 behavior, but diverges from CPython 3.12+, which inlines listcomps (PEP 709). The tests have no `.exp` files and are compared against the host CPython, which is 3.12 in CI.

Trade-offs and Alternatives
The main trade-off is ~816 bytes of code size for a feature that may not be needed if function-boundary switching is sufficient. Alternatives considered during development:
- `prof_trace_callback` read on the hot path - adds overhead for all bytecode execution, not just settrace users.
- `mp_sched_schedule` - disproportionate overhead for a simple flag check.
- `vm_switch_pending` approach.

Generative AI
I used generative AI tools when creating this PR, but a human has checked the code and is responsible for the description above.