Bug Report: macmon GPU monitoring triggers SIGABRT in libmlx.dylib during active inference
Environment
- Hardware: Mac17,6 (MacBook Pro M5, 128GB)
- OS: macOS 26.4.1 (25E253)
- Python: 3.12.13 (Homebrew)
- Stack: Exo (ai.hermes.gateway-ss coalition), oMLX paged inference, libmlx.dylib
- Triggered by: Running macmon (the system monitor recommended in Exo's README) while MLX inference is active
What Happened
Launching macmon while Exo/oMLX is serving inference causes a hard crash in Python (PID 39036) with SIGABRT / Abort trap: 6.
Crash thread (Thread 52 — com.Metal.CompletionQueueDispatch):
mlx::core::gpu::check_error(MTLCommandBuffer)
→ __cxa_throw (C++ exception thrown inside Metal async dispatch boundary)
→ std::terminate() (exception cannot propagate across dispatch queue)
→ abort()
→ SIGABRT
Full stack:
0 libsystem_kernel.dylib __pthread_kill
1 libsystem_pthread.dylib pthread_kill
2 libsystem_c.dylib abort
3 libc++abi.dylib abort_message
4 libc++abi.dylib demangling_terminate_handler
5 libobjc.A.dylib objc_terminate
6 libc++abi.dylib std::terminate()
7 libc++abi.dylib __cxa_throw
8 libmlx.dylib mlx::core::gpu::check_error(MTLCommandBuffer*)
9 libmlx.dylib (Metal CompletionHandler block)
10 Metal -[MTLCommandBuffer didCompleteWith...]
11 IOGPU IOGPUNotificationQueueDispatchAvailableCompletionNotifications
Root Cause
macmon reads Apple Silicon performance counters and GPU utilization metrics via IOKit/IOGPUFamily — the same interfaces Metal uses internally. When macmon samples the GPU concurrently with an in-flight Metal command buffer (during MLX inference), the GPU returns an error state that mlx::core::gpu::check_error detects and attempts to throw as a C++ exception. Because this happens inside a GCD async dispatch block (com.Metal.CompletionQueueDispatch), the exception cannot propagate, triggering std::terminate() → abort().
This is not a user error — Exo's own documentation and README point users toward macmon as the recommended Apple Silicon monitoring tool. Users following those instructions will hit this crash reliably on M3/M4/M5 systems running active inference.
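The failure mode described in the root cause above, an exception escaping a completion callback, is commonly avoided by recording the error inside the callback and re-raising it on the thread that waits for the result. The sketch below illustrates only that pattern, written in Python to match the affected process. It is not MLX code, and Python itself would not abort in this situation. DeferredError and check_error are hypothetical names, and the status strings are assumptions made for illustration.

```python
import threading


class DeferredError:
    """Holds the first exception raised inside a completion callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self._exc = None

    def guard(self, callback):
        """Wrap callback so its exceptions are recorded instead of escaping."""
        def wrapper(*args, **kwargs):
            try:
                callback(*args, **kwargs)
            except Exception as exc:
                with self._lock:
                    if self._exc is None:
                        self._exc = exc
        return wrapper

    def raise_if_set(self):
        """Re-raise the recorded exception (if any) on the calling thread."""
        with self._lock:
            exc, self._exc = self._exc, None
        if exc is not None:
            raise exc


def check_error(status):
    # Hypothetical stand-in for a command-buffer status check.
    if status != "completed":
        raise RuntimeError(f"command buffer failed: {status}")


errors = DeferredError()
# The worker thread plays the role of the completion queue.
worker = threading.Thread(target=errors.guard(check_error), args=("error",))
worker.start()
worker.join()

try:
    errors.raise_if_set()
except RuntimeError as exc:
    print(f"recovered on caller thread: {exc}")
```

With this shape the callback never lets an exception cross the queue boundary, so the caller can decide whether to retry, report, or shut down cleanly.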
Reproduction Steps
- Start Exo with MLX inference engine on Apple Silicon (M3/M4/M5)
- Load a model and begin inference (active Metal command buffers in flight)
- Launch macmon in a separate terminal
- Observe Python crash with SIGABRT / Abort trap: 6
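To confirm the last step without reading the crash dialog, the inference process can be started from a small wrapper that reports how it exited. This is a minimal sketch using only the Python standard library. On POSIX systems, subprocess reports a child killed by a signal as a negative return code. The describe_exit and run_and_report helpers are hypothetical names, and the demo child simply calls os.abort() as a stand-in for the crashing inference process.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # POSIX: a negative value means the child was killed by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_and_report(argv):
    """Run argv to completion and return (returncode, description)."""
    proc = subprocess.run(argv)
    return proc.returncode, describe_exit(proc.returncode)


# Stand-in child: aborts itself the same way the crashing process ends.
code, text = run_and_report([sys.executable, "-c", "import os; os.abort()"])
print(text)
```

Replacing the stand-in argv with the actual command used to start inference would show whether a given run ended in SIGABRT or exited normally.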
Recommendation: Replace macmon with mactop
mactop is a drop-in alternative that:
- Uses the same sysinfo / powermetrics data sources as Activity Monitor
- Does not directly query IOGPUFamily in a way that interferes with active Metal sessions
- Has been used extensively on the same M5 hardware alongside active MLX inference with zero crashes
Suggested README change:
macmon → mactop for real-time Apple Silicon GPU/ANE/CPU monitoring during Exo inference
Install:
brew install mactop
sudo mactop # requires sudo for power metrics
Impact
- Any user following Exo's documented toolchain on Apple Silicon who uses macmon for monitoring while running inference is at risk of hard Python crashes mid-session
- Crash is non-recoverable (process terminates, active inference context lost)
- No warning or graceful error, just Abort trap: 6
Crash Report Excerpt (Thread 52 — crashing thread)
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Termination Reason: Namespace SIGNAL, Code 6, Abort trap: 6
Triggered by Thread: 52, Dispatch Queue: com.Metal.CompletionQueueDispatch
Thread 52 Crashed:
0 libsystem_kernel.dylib __pthread_kill + 8
1 libsystem_pthread.dylib pthread_kill + 296
2 libsystem_c.dylib abort + 148
3 libc++abi.dylib abort_message + 132
4 libc++abi.dylib demangling_terminate_handler + 272
5 libobjc.A.dylib objc_terminate + 172
6 libc++abi.dylib std::terminate() + 16
7 libc++abi.dylib __cxa_throw + 92
8 libmlx.dylib mlx::core::gpu::check_error(MTLCommandBuffer*) + 244
9 libmlx.dylib (Metal completion handler block)
Coalition: ai.hermes.gateway-ss | Hardware: Mac17,6 | macOS 26.4.1
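For triage across several crash reports, the crashing thread's frames can be parsed to find the first frame outside the system runtime libraries. The sketch below assumes frames in the simplified "index image symbol + offset" form quoted in this report. Full macOS crash reports also carry an address column, which this pattern does not handle. The parse_frames and first_frame_outside helpers and the SYSTEM_IMAGES set are hypothetical and chosen for this example only.

```python
import re

# index, image name, symbol, optional "+ offset" suffix
FRAME_RE = re.compile(r"^(\d+)\s+(\S+)\s+(.+?)(?:\s+\+\s+(\d+))?$")

# Assumption: images treated as runtime plumbing rather than the culprit.
SYSTEM_IMAGES = {
    "libsystem_kernel.dylib",
    "libsystem_pthread.dylib",
    "libsystem_c.dylib",
    "libc++abi.dylib",
    "libobjc.A.dylib",
}


def parse_frames(text):
    """Return a list of frame dicts for every line that looks like a frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            index, image, symbol, offset = match.groups()
            frames.append({
                "index": int(index),
                "image": image,
                "symbol": symbol,
                "offset": int(offset) if offset else None,
            })
    return frames


def first_frame_outside(frames, system_images=SYSTEM_IMAGES):
    """Return the first frame whose image is not in system_images, or None."""
    for frame in frames:
        if frame["image"] not in system_images:
            return frame
    return None


excerpt = """\
Thread 52 Crashed:
6 libc++abi.dylib std::terminate() + 16
7 libc++abi.dylib __cxa_throw + 92
8 libmlx.dylib mlx::core::gpu::check_error(MTLCommandBuffer*) + 244
9 libmlx.dylib (Metal completion handler block)
"""

culprit = first_frame_outside(parse_frames(excerpt))
print(culprit["image"], culprit["symbol"])
```

Applied to the excerpt above, the first non-system frame is the check_error call in libmlx.dylib, which matches the analysis in this report.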