9 changes: 9 additions & 0 deletions openhcl/openhcl_boot/src/main.rs
@@ -290,6 +290,15 @@ fn build_kernel_command_line(

// Only when explicitly supported by Host.
// TODO: Move from command line to device tree when stabilized.
//
// DEVNOTE (mattkur): Create a new command line flag to pass along whether
// the host supports keep alive (independent of whether a vtl2 pool is supported).
// e.g.:
// OPENHCL_NVME_KEEP_ALIVE=1 (legacy, let's remove this)
// OPENHCL_NVME_KEEP_ALIVE=host,privatepool (supported by host and we have private pool)
// OPENHCL_NVME_KEEP_ALIVE=nohost,privatepool (not supported by host but we have a private pool)
// OPENHCL_NVME_KEEP_ALIVE=host,noprivatepool (supported by host, but no private pool (KA will be disabled))
// OPENHCL_NVME_KEEP_ALIVE=disabled,{no,}host,{no,}privatepool (explicitly disabled by command line, rest of flags as above)
if partition_info.nvme_keepalive && vtl2_pool_supported && !disable_keep_alive {
write!(cmdline, "OPENHCL_NVME_KEEP_ALIVE=1 ")?;
}
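
To make the DEVNOTE above concrete, here is a rough sketch of how the proposed flag values might be modeled and parsed. The names `NvmeKeepAliveConfig` and `parse_keep_alive` are hypothetical, not part of this change:

```rust
/// Hypothetical model of the proposed OPENHCL_NVME_KEEP_ALIVE values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NvmeKeepAliveConfig {
    /// Legacy form: OPENHCL_NVME_KEEP_ALIVE=1 (to be removed).
    Legacy,
    /// host/nohost + privatepool/noprivatepool, as enumerated in the DEVNOTE.
    Negotiated { host_supported: bool, private_pool: bool },
    /// Explicitly disabled on the command line; remaining flags as above.
    Disabled { host_supported: bool, private_pool: bool },
}

fn parse_keep_alive(value: &str) -> Option<NvmeKeepAliveConfig> {
    if value == "1" {
        return Some(NvmeKeepAliveConfig::Legacy);
    }
    let (mut disabled, mut host, mut pool) = (false, None, None);
    for part in value.split(',') {
        match part {
            "disabled" => disabled = true,
            "host" => host = Some(true),
            "nohost" => host = Some(false),
            "privatepool" => pool = Some(true),
            "noprivatepool" => pool = Some(false),
            _ => return None, // unknown token
        }
    }
    let (host_supported, private_pool) = (host?, pool?);
    Some(if disabled {
        NvmeKeepAliveConfig::Disabled { host_supported, private_pool }
    } else {
        NvmeKeepAliveConfig::Negotiated { host_supported, private_pool }
    })
}
```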
15 changes: 14 additions & 1 deletion openhcl/underhill_core/src/nvme_manager/device.rs
@@ -91,7 +91,7 @@ impl CreateNvmeDriver for VfioNvmeDriverSpawner {
} else {
AllocationVisibility::Private
},
persistent_allocations: save_restore_supported,
persistent_allocations: save_restore_supported, // OK in the restore-on-old-host case, since this will correctly be `false`. Ideally, though, we keep it `true` so that we can save on a host without support and restore on a node that has it.
})
.map_err(NvmeSpawnerError::DmaClient)?;

@@ -131,6 +131,19 @@ impl CreateNvmeDriver for VfioNvmeDriverSpawner {
driver: nvme_driver,
}))
}

// DEVNOTE(mattkur): TODO: create a new trait fn that clears and resets state, for the case of
// a restore on a host that does not support save/restore.
//
// This will:
// 1) Create a DMA client; be sure to say that we require persistent memory (even though save_restore_supported will actually be false).
Review comment (Contributor):
This is good for handling the happy-path scenario, but in the case where the DMA client fails to create persistent memory allocations, do we fail here? If so, we may need to think of a better fallback path, since the host here doesn't support save/restore.

IMO we could allow it to proceed without saving the device state, and when it later live-migrates off to a host that supports keep alive and a private pool, we would need to allow the restore to happen successfully for such scenarios as well. Let me know your thoughts.

// 2) Reset the device (call `VfioDevice::new` to get a fresh handle, and do it in a way that causes VFIO to issue an FLR).
// 3) Call dma_client.attach_pending_buffers().context("failed to restore allocations")?;
// 4) Free all the allocations on the DMA client.
// 5) Close the handle to the device (this will probably FLR it *again*, but that's probably OK; it does double the time to 250ms x 2 per device).
//
// Now the device is in a clean state.
// After this, the next calls to get a `namespace` object will first open a handle to the device (so NvmeManagerWorker::restore() can simply be done).
}
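
As a rough illustration of the reset sequence sketched in the DEVNOTE above, the following stands alone with stub types. `DmaClientLike`, `Allocation`, `DeviceHandle`, and `reset_and_clear` are all hypothetical stand-ins; the real signatures live in the openhcl crates and may differ:

```rust
use anyhow::Context;

// Hypothetical stand-ins for the real VFIO and DMA client types.
struct Allocation;
struct DeviceHandle;

trait DmaClientLike {
    fn attach_pending_buffers(&self) -> anyhow::Result<Vec<Allocation>>;
    fn free(&self, alloc: Allocation);
}

/// Sketch of the proposed "clear and reset" flow for a restore on a host
/// that does not support save/restore (steps mirror the DEVNOTE above).
fn reset_and_clear(
    dma_client: &dyn DmaClientLike,
    // Stands in for `VfioDevice::new`; assumed to trigger an FLR.
    open_device: impl Fn() -> anyhow::Result<DeviceHandle>,
) -> anyhow::Result<()> {
    // 2) Reset the device by taking a fresh handle.
    let handle = open_device()?;
    // 3) Reattach any allocations left over from the previous instance.
    let pending = dma_client
        .attach_pending_buffers()
        .context("failed to restore allocations")?;
    // 4) Free all the allocations on the DMA client.
    for alloc in pending {
        dma_client.free(alloc);
    }
    // 5) Close the handle (likely FLRs the device again; ~250ms x 2 per device).
    drop(handle);
    Ok(())
}
```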

impl VfioNvmeDriverSpawner {
16 changes: 15 additions & 1 deletion openhcl/underhill_core/src/nvme_manager/manager.rs
@@ -134,6 +134,13 @@ impl NvmeManager {
}

/// Restore NVMe manager's state after servicing.
///
/// DEVNOTE(mattkur): If the environment says that save/restore is not supported, then
/// we are restoring on a host where we do not understand the state. We must
/// quiesce IO (reset the device), clear any allocations, and then create a new device.
///
/// The "nice" thing is that we can just be in an empty state after restore. This will cause
/// us to create new drivers and namespaces.
async fn restore(
worker: &mut NvmeManagerWorker,
saved_state: &NvmeSavedState,
@@ -431,6 +438,13 @@ impl NvmeManagerWorker {
}

/// Restore NVMe manager and device states from the buffer after servicing.
///
/// DEVNOTE(mattkur): If the environment says that save/restore is not supported, then
/// we are restoring on a host where we do not understand the state. We must
/// quiesce IO (reset the device), clear any allocations, and then create a new device.
///
/// The "nice" thing is that we can just be in an empty state after restore. This will cause
/// us to create new drivers and namespaces.
pub async fn restore(&mut self, saved_state: &NvmeManagerSavedState) -> anyhow::Result<()> {
let mut restored_devices: HashMap<String, NvmeDriverManager> = HashMap::new();

@@ -443,7 +457,7 @@
&self.context.driver_source,
&pci_id,
saved_state.cpu_count,
true, // save_restore_supported is always `true` when restoring.
true, // save_restore_supported is always `true` when restoring. (TODO(mattkur): this is no longer true.)
Some(&disk.driver_state),
)
.await?;
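
A minimal sketch of the early-out described in the restore DEVNOTEs, assuming a hypothetical `save_restore_supported` flag on the worker (all names here are illustrative, not the crate's actual types):

```rust
struct NvmeSavedStateStub;

struct WorkerStub {
    save_restore_supported: bool,
}

impl WorkerStub {
    fn restore(&mut self, _saved_state: &NvmeSavedStateStub) -> anyhow::Result<()> {
        if !self.save_restore_supported {
            // The host cannot honor keep alive: quiesce IO, reset devices,
            // free allocations, and come up empty. New drivers and namespaces
            // are then created lazily on the next request for a namespace.
            return Ok(());
        }
        // ... existing per-device restore path ...
        Ok(())
    }
}
```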
3 changes: 3 additions & 0 deletions openhcl/underhill_core/src/nvme_manager/mod.rs
@@ -110,4 +110,7 @@ pub trait CreateNvmeDriver: Inspect + Send + Sync {
save_restore_supported: bool,
saved_state: Option<&nvme_driver::NvmeDriverSavedState>,
) -> Result<Box<dyn NvmeDevice>, NvmeSpawnerError>;

// DEVNOTE(mattkur): TODO: create a new trait fn that clears and resets state, in the case of
// a restore on a host that does not support restore.
}
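
One speculative shape for that new trait fn; the name `clear_and_reset_state` and its parameter list are placeholders, not part of the current change:

```rust
/// Speculative companion to CreateNvmeDriver: resets the device and frees
/// persistent allocations when restoring on a host without save/restore
/// support. Requires Rust 1.75+ for async fn in traits.
trait ClearAndResetState {
    async fn clear_and_reset_state(&self, pci_id: &str) -> anyhow::Result<()>;
}
```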
1 change: 1 addition & 0 deletions openhcl/underhill_core/src/options.rs
@@ -181,6 +181,7 @@ pub struct Options {
pub no_sidecar_hotplug: bool,

/// (OPENHCL_NVME_KEEP_ALIVE=1) Enable nvme keep alive when servicing.
/// See note in openhcl_boot/src/main.rs and then change this type to an enum.
pub nvme_keep_alive: bool,

/// (OPENHCL_NVME_ALWAYS_FLR=1)
5 changes: 5 additions & 0 deletions openhcl/underhill_core/src/worker.rs
@@ -1998,6 +1998,11 @@ async fn new_underhill_vm(
let nvme_manager = if env_cfg.nvme_vfio {
// TODO: reevaluate enablement of nvme save restore when private pool
// save restore to bootshim is available.
//
// DEVNOTE(mattkur): In the event that this UH is _restoring_ on a host that no
// longer supports nvme_keep_alive (but UH itself supports this, and thus
// has a private pool), private_pool_available will be true and save_restore_supported will be false.
//
let private_pool_available = !runtime_params.private_pool_ranges().is_empty();
let save_restore_supported = env_cfg.nvme_keep_alive && private_pool_available;
