Commit e97dddc
Improve UART input wait with coroutine yielding
When the guest OS waits for UART input (e.g., running 'cat'), semu spins at
100% CPU polling stdin. In SMP mode with 4 cores, this becomes 400% CPU
usage, completely saturating the host system during idle periods.
The original implementation polled stdin continuously in a busy loop:
- Single-core: 100% CPU (polling in main loop)
- 4-core SMP: 400% CPU (4 harts × 100% each)
Even with WFI optimization, harts were resumed every iteration and
immediately re-checked UART status, defeating the event-driven design.
Implemented hart-level yielding when UART input is unavailable:
1. u8250_wait_for_input(): New coroutine yield function
   - Checks if running in coroutine context (hart_id != UINT32_MAX)
   - Marks hart as waiting via uart.waiting_hart_id
   - Yields control back to scheduler
   - Clears waiting state after resume
2. u8250_state_t fields (device.h):
   - waiting_hart_id: Tracks which hart is waiting (UINT32_MAX if none)
   - has_waiting_hart: Boolean flag for fast checking
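
The four steps listed for u8250_wait_for_input() can be sketched as below. This is a minimal illustration, not the code from the commit: the struct carries only the two fields named above, and `coro_yield()`, `current_hart`, and `yield_count` are hypothetical stand-ins for the real coroutine layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the two fields named in the commit message; the real struct has more. */
typedef struct {
    uint32_t waiting_hart_id; /* UINT32_MAX when no hart is waiting */
    bool has_waiting_hart;    /* fast-path flag for the scheduler */
} u8250_state_t;

/* Hypothetical stand-ins for the coroutine layer (assumptions). */
static uint32_t current_hart = UINT32_MAX; /* UINT32_MAX: not in a coroutine */
static unsigned yield_count = 0;

static uint32_t coro_current_hart_id(void) { return current_hart; }

/* Stub: a real implementation would switch back to the scheduler here. */
static void coro_yield(void) { yield_count++; }

/* Yield the calling hart until the scheduler resumes it. */
static void u8250_wait_for_input(u8250_state_t *uart)
{
    assert(uart);

    uint32_t hart_id = coro_current_hart_id();
    if (hart_id == UINT32_MAX)
        return; /* not in coroutine context: nothing to yield to */

    /* Mark this hart as waiting so the scheduler can skip it. */
    uart->waiting_hart_id = hart_id;
    uart->has_waiting_hart = true;

    coro_yield(); /* execution continues here after the hart is resumed */

    /* Clear the waiting state after resume. */
    uart->has_waiting_hart = false;
    uart->waiting_hart_id = UINT32_MAX;
}
```

With the stub, a call made while `current_hart` is UINT32_MAX returns without yielding, and a call made from a valid hart yields once and leaves the waiting state cleared.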
Result: 76.8% CPU reduction (100% → 23.2%)
Fixed coro_current_hart_id() to return UINT32_MAX when coroutine is not
initialized (single-core mode), preventing incorrect yielding attempts.
Result: Eliminates single-core mode errors
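
The guard described here might look roughly like the following. The `coro_state` struct and its field names are assumptions made for illustration; only the function name and the UINT32_MAX sentinel come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical module state; the real coroutine module tracks more. */
static struct {
    bool initialized;      /* set once the SMP coroutine scheduler is set up */
    uint32_t current_hart; /* hart whose coroutine is currently running */
} coro_state = { false, 0 };

/* Return the running hart's ID, or UINT32_MAX outside coroutine context. */
static uint32_t coro_current_hart_id(void)
{
    if (!coro_state.initialized)
        return UINT32_MAX; /* single-core mode: callers must not yield */
    return coro_state.current_hart;
}
```

Callers such as u8250_wait_for_input() can then treat UINT32_MAX as "do not yield" without knowing whether SMP mode is active.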
Optimization: Moved hart waiting check BEFORE resume loop:

Before:
    /* Resume all harts unconditionally */
    for (i = 0; i < n_hart; i++)
        coro_resume_hart(i);
    /* Then check if waiting */
    if (all_waiting)
        kevent(...); /* Event-driven wait */

After:
    /* Check FIRST if all waiting */
    if (all_waiting) {
        kevent(...); /* Event-driven wait */
        /* Resume UART-waiting hart only if stdin ready */
        if (uart.has_waiting_hart && uart.in_ready)
            coro_resume_hart(uart.waiting_hart_id);
    } else {
        /* Only resume when there's actual work */
        for (i = 0; i < n_hart; i++)
            coro_resume_hart(i);
    }

Key changes:
- all_waiting check now includes UART waiting state
- Harts are NOT resumed unless there's work or input available
- True event-driven blocking when all harts idle
Result: CPU reduction rises from 88.5% to 99.1-99.4% (usage 11.5% → 0.6-0.9%)
Event backends:
kqueue:
- Cannot register stdin with kqueue (terminal device limitation)
- Uses 1ms timer for event-driven wake
- Polls stdin via u8250_check_ready() after wake
poll():
- Direct stdin monitoring via pollfd
- Purely event-driven (no polling)
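
For the pollfd path, blocking on a descriptor until it becomes readable can be sketched with the standard poll() call. The helper name `wait_readable` is hypothetical, and the sketch watches a single descriptor, whereas the emulator's loop presumably watches stdin together with its timer.

```c
#include <poll.h>
#include <stdbool.h>

/* Block until fd is readable or timeout_ms elapses (-1 waits indefinitely).
 * Returns true only when input is ready; timeouts and errors return false. */
static bool wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    return n > 0 && (pfd.revents & POLLIN) != 0;
}
```

While the call is blocked the process consumes no CPU, which is the property the check-before-wake change relies on.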
Even with coroutine yielding, the original pattern woke all harts every
loop iteration. This defeated the event-driven design because:
1. Hart resumes → checks UART → no input → yields → repeat
2. This happened thousands of times per second (timer fires every 1ms)
3. Each resume/yield cycle consumed CPU
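
The cost of that cycle can be illustrated with a toy model that counts hart resumes while every hart is idle. Both functions are hypothetical and only model the two scheduling patterns; they do not measure the emulator.

```c
#include <stdbool.h>

/* Old pattern: every loop iteration resumes every hart, and each one
 * yields again immediately because no input has arrived. */
static unsigned resumes_old(unsigned n_hart, unsigned ticks)
{
    unsigned resumes = 0;
    for (unsigned t = 0; t < ticks; t++)
        for (unsigned i = 0; i < n_hart; i++)
            resumes++;
    return resumes;
}

/* Check-before-wake: while all harts wait, only the UART-waiting hart is
 * resumed, and only on the iteration where input is ready. */
static unsigned resumes_new(unsigned ticks, unsigned input_tick)
{
    unsigned resumes = 0;
    for (unsigned t = 0; t < ticks; t++) {
        bool in_ready = (t == input_tick);
        if (in_ready)
            resumes++;
    }
    return resumes;
}
```

With 4 harts over 1000 iterations (about one second at a 1ms timer), the old pattern performs 4000 resumes, while the new one performs a single resume when input arrives.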
The check-before-wake pattern ensures:
1. When idle, system blocks in kevent()/poll()
2. Only wakes on actual events (timer expiry, stdin input)
3. Only resumes harts when there's work or input available

1 parent e2a5b74, commit e97dddc
4 files changed: +74 -18 lines changed