-
Notifications
You must be signed in to change notification settings - Fork 161
RFC: sidecar: allow openhcl_boot to specify a state per-cpu, and have openhcl_boot set that state for VPs that have mapped interrupts #2477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…hcl_boot set that state for VPs that have mapped interrupts
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a mechanism for openhcl_boot to specify per-CPU startup state, allowing it to control which CPUs are started by the sidecar kernel versus the main kernel. This optimization enables CPUs with mapped interrupts to be started immediately by the main kernel, avoiding unnecessary transitions.
Key changes:
- Adds
PerCpuStatestructure to track which CPUs should be started by sidecar - Updates sidecar initialization to respect per-CPU state overrides when specified
- Modifies the boot process to identify CPUs with mapped interrupts and keep them in the main kernel
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| openhcl/sidecar_defs/src/lib.rs | Defines PerCpuState structure and adds it to SidecarParams |
| openhcl/sidecar/src/arch/x86_64/init.rs | Updates sidecar init to apply per-CPU state overrides to cpu_status array |
| openhcl/openhcl_boot/src/sidecar.rs | Modifies sidecar startup and kernel command line generation to use per-CPU state |
| openhcl/openhcl_boot/src/main.rs | Initializes sidecar_cpu_overrides in test code |
| openhcl/openhcl_boot/src/host_params/mod.rs | Adds sidecar_cpu_overrides field to PartitionInfo |
| openhcl/openhcl_boot/src/host_params/dt/mod.rs | Identifies CPUs with mapped interrupts and configures them to stay in main kernel |
| /// Whether the CPU should be started in the sidecar kernel. If false, | ||
| /// the CPU will remain in the main kernel. |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states "the CPU will remain in the main kernel" when sidecar_starts_cpu is false, but this is backwards. When sidecar_starts_cpu[i] is false, the CPU is NOT started by sidecar, meaning it stays with the main kernel. When it's true, sidecar starts the CPU.
The current wording suggests false means "remain in main kernel" (correct) but the field name sidecar_starts_cpu being false should mean "sidecar does NOT start this CPU" which is the same thing. However, the comment could be clearer. Consider:
/// Whether the CPU should be started by the sidecar kernel. If false,
/// the CPU will be started by (or remain with) the main kernel instead.
| /// Whether the CPU should be started in the sidecar kernel. If false, | |
| /// the CPU will remain in the main kernel. | |
| /// Whether the CPU should be started by the sidecar kernel. If false, | |
| /// the CPU will be started by (or remain with) the main kernel instead. |
| if initial_state.per_cpu_state_specified { | ||
| // If per-CPU state is specified, make sure to explicitly state that | ||
| // sidecar should not start the base vp of this node. | ||
| initial_state.sidecar_starts_cpu[base_vp as usize] = false; |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing bounds check before indexing into sidecar_starts_cpu array. The value of base_vp can be up to MAX_CPU_COUNT (2048), but NUM_CPUS_SUPPORTED_FOR_PER_CPU_STATE is only 1000. This can cause an out-of-bounds panic.
Consider adding a bounds check:
if initial_state.per_cpu_state_specified && (base_vp as usize) < initial_state.sidecar_starts_cpu.len() {
initial_state.sidecar_starts_cpu[base_vp as usize] = false;
}| initial_state.sidecar_starts_cpu[base_vp as usize] = false; | |
| if (base_vp as usize) < initial_state.sidecar_starts_cpu.len() { | |
| initial_state.sidecar_starts_cpu[base_vp as usize] = false; | |
| } |
| if !should_start { | ||
| cpu_status[i] = CpuStatus::REMOVED.0.into(); | ||
| } |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The override logic only sets cpu_status[i] to REMOVED when should_start is false, but doesn't set it to RUN when should_start is true. This means that for CPU 0 (which is set to REMOVED by default on line 325), if should_start is true, it will remain REMOVED instead of being changed to RUN.
Consider changing the logic to:
cpu_status[i] = if should_start {
CpuStatus::RUN.0.into()
} else {
CpuStatus::REMOVED.0.into()
};| if !should_start { | |
| cpu_status[i] = CpuStatus::REMOVED.0.into(); | |
| } | |
| cpu_status[i] = if should_start { | |
| CpuStatus::RUN.0.into() | |
| } else { | |
| CpuStatus::REMOVED.0.into() | |
| }; |
| for (cpu_index, &starts) in self.0.per_cpu_state.sidecar_starts_cpu.iter().enumerate() { | ||
| if starts { | ||
| write!(f, "{}{}", comma, cpu_index)?; | ||
| comma = ","; | ||
| } | ||
| } |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The boot_cpus command line generation iterates through all 1000 entries in sidecar_starts_cpu, but should only iterate through the actual number of CPUs that exist. This will generate incorrect boot_cpus values for configurations with fewer than 1000 CPUs.
Consider limiting the iteration to the actual CPU count:
let cpu_count = self.0.node_params.iter().map(|n| n.base_vp + n.vp_count).max().unwrap_or(0);
for (cpu_index, &starts) in self.0.per_cpu_state.sidecar_starts_cpu
.iter()
.enumerate()
.take(cpu_count as usize)
{
if starts {
write!(f, "{}{}", comma, cpu_index)?;
comma = ",";
}
}| .iter() | ||
| .max() | ||
| .expect("non-empty vector") as usize | ||
| <= sidecar_cpu_overrides.sidecar_starts_cpu.len() |
Copilot
AI
Nov 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comparison should use < instead of <= to ensure the maximum CPU ID fits within the array. With the current logic, if the max CPU ID is exactly 1000, it would be accepted, but indexing sidecar_starts_cpu[1000] would be out of bounds (the array has indices 0-999).
Change to:
if *cpus_with_mapped_interrupts
.iter()
.max()
.expect("non-empty vector") as usize
< sidecar_cpu_overrides.sidecar_starts_cpu.len()| <= sidecar_cpu_overrides.sidecar_starts_cpu.len() | |
| < sidecar_cpu_overrides.sidecar_starts_cpu.len() |
| comma = ","; | ||
| if self.0.per_cpu_state.per_cpu_state_specified { | ||
| let mut comma = ""; | ||
| for (cpu_index, &starts) in self.0.per_cpu_state.sidecar_starts_cpu.iter().enumerate() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self, this is backwards
| cpus, | ||
| sidecar_cpu_overrides: PerCpuState { | ||
| per_cpu_state_specified: false, | ||
| sidecar_starts_cpu: [false; sidecar_defs::NUM_CPUS_SUPPORTED_FOR_PER_CPU_STATE], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: just default these to true (since that's closer to right)
DO NOT MERGE, TESTING IN PROGRESS
Experiment: ask the kernel to online any CPUs that will immediately get interrupts mapped, and ask sidecar
to online the rest.