Ovmf memory debug logging2 #10996

Open: wants to merge 7 commits into base: master

ajyoung-oracle
Contributor

@ajyoung-oracle ajyoung-oracle commented Apr 23, 2025

Description

OVMF Memory Debug Logging Support

Overview:

This PR adds support to log OVMF debug messages to a memory buffer. The memory buffer can be extracted from a VM's QEMU process (or a core file) to debug issues.

Background:

OVMF currently offers two methods to collect debug messages: via the "debug IO port" (logged by QEMU to a host disk file) and via the serial port (-D DEBUG_ON_SERIAL_PORT). Both methods are impractical under normal VM operation: they increase VM boot time, clutter the serial port, and use host disk resources. In contrast, logging the debug messages to memory is fast and takes no disk resources, so it can be enabled by default in customer environments. This allows for "first-time" issue analysis, negating the need to reproduce issues with debug messages enabled, which makes for a much better customer experience should an issue occur.

Code Overview:

The code introduces 3 new libraries:

  • OvmfPkg/Library/PlatformDebugLibMem: New DebugLib class library which logs debug messages to memory. This library was derived from PlatformDebugLibIoPort - simply altered to pass the debug message to MemDebugLogWrite() instead of writing the debug message to the debug IO port.

  • OvmfPkg/Library/MemDebugLogCommonLib: Library to manage access to the memory debug log circular buffer.

  • OvmfPkg/Library/MemDebugLogLib: Library which implements the key MemDebugLogWrite() function (called by PlatformDebugLibMem). There is a different implementation of this library for the various boot phases/module types (i.e. SEC, PEIM, DXE, Runtime). See more detail below.

The code also introduces a new PEIM:

  • OvmfPkg/MemDebugLogPei: When memory becomes available, this PEIM allocates the main memory debug log buffer and installs a PPI containing a function to write to the main memory buffer.

A bit of complication arises from the fact that main memory is not available until the PEI phase and several debug messages are logged before then. To remedy this, the code makes use of an "early" memory buffer (taken from PeiMemFvBase) which is used to log the initial (pre-memory) messages. Once memory becomes available, the PEIM code allocates the main memory debug log buffer, copies the messages from the early buffer and then switches to use the main memory buffer (by installing a PPI/HOB).

Thus, the SEC version of MemDebugLogLib writes debug messages to the "early" debug log buffer. The PEIM version of MemDebugLogLib also writes initial messages to the early buffer until the MemDebugLogPei PPI is installed. Once the PPI is installed, the PEIM version of MemDebugLogLib switches to use the PPI. The DXE and runtime versions of MemDebugLogLib write to the main memory debug log buffer (obtaining the address of the buffer via a HOB). The runtime version of MemDebugLogLib (used by DXE_RUNTIME_DRIVER type drivers) will allow debug writes until the OS calls the SetVirtualAddressMap() runtime service.
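
For readers who want to see the hand-off in miniature, here is a rough sketch in plain C (not the PR's code). All `Sketch*` names are hypothetical, the `Main` pointer merely stands in for the PPI/HOB lookup, and the buffers are linear rather than circular to keep the focus on the switch-over:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical linear log; the real buffers are circular. */
typedef struct {
  char   *Buf;
  size_t  Cap;
  size_t  Len;
} SKETCH_LOG;

typedef struct {
  SKETCH_LOG  Early;  /* pre-memory buffer used by SEC and early PEI */
  SKETCH_LOG *Main;   /* NULL until the main buffer is published     */
} SKETCH_STATE;

/* Append as much of Msg as fits (truncates instead of wrapping). */
static void SketchAppend(SKETCH_LOG *Log, const char *Msg, size_t MsgLen) {
  size_t Room = Log->Cap - Log->Len;
  if (MsgLen > Room) {
    MsgLen = Room;
  }
  memcpy(Log->Buf + Log->Len, Msg, MsgLen);
  Log->Len += MsgLen;
}

/* Write path: use the main buffer once it exists, else the early one. */
static void SketchWrite(SKETCH_STATE *State, const char *Msg) {
  SKETCH_LOG *Target = (State->Main != NULL) ? State->Main : &State->Early;
  SketchAppend(Target, Msg, strlen(Msg));
}

/* "PEIM" step: copy the early messages over, then publish the main buffer. */
static void SketchInstallMain(SKETCH_STATE *State, SKETCH_LOG *Main) {
  assert(State->Main == NULL);
  SketchAppend(Main, State->Early.Buf, State->Early.Len);
  State->Main = Main;
}
```

Copying before publishing the pointer means the main buffer holds the early messages ahead of any later ones, so ordering is preserved in this single-threaded sketch.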

All calls to MemDebugLogWrite() eventually end up calling MemDebugLogWriteCommon() (from MemDebugLogCommonLib) - which handles the details of maintaining the circular debug log buffer and header (with head/tail pointers, etc.). Since it is theoretically possible for multiple vcpus to attempt to write debug messages simultaneously (during vcpu init), the library uses a spinlock to protect the critical sections when accessing the buffer.
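
As a minimal illustration of the idea (an assumption-laden sketch, not MemDebugLogWriteCommon() itself), the following portable C uses a C11 `atomic_flag` as the spinlock and overwrites the oldest bytes once the buffer is full. The `SKETCH_LOG` layout and the 64-byte capacity are made up for the example:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG_CAPACITY 64u  /* tiny, so wrap-around is easy to observe */

/* Hypothetical layout; the PR's real header is not reproduced here. */
typedef struct {
  atomic_flag Lock;  /* spinlock protecting the fields below */
  uint32_t    Head;  /* index of the oldest retained byte    */
  uint32_t    Tail;  /* index where the next byte is stored  */
  uint32_t    Used;  /* number of valid bytes                */
  char        Data[SKETCH_LOG_CAPACITY];
} SKETCH_LOG;

static void SketchLogInit(SKETCH_LOG *Log) {
  atomic_flag_clear(&Log->Lock);
  Log->Head = Log->Tail = Log->Used = 0;
}

static void SketchLock(SKETCH_LOG *Log) {
  while (atomic_flag_test_and_set_explicit(&Log->Lock, memory_order_acquire)) {
    /* spin until the current holder releases the lock */
  }
}

static void SketchUnlock(SKETCH_LOG *Log) {
  atomic_flag_clear_explicit(&Log->Lock, memory_order_release);
}

/* Append Len bytes; when full, each new byte displaces the oldest one. */
static void SketchLogWrite(SKETCH_LOG *Log, const char *Msg, uint32_t Len) {
  assert(Log != NULL && Msg != NULL);
  SketchLock(Log);
  for (uint32_t i = 0; i < Len; i++) {
    Log->Data[Log->Tail] = Msg[i];
    Log->Tail = (Log->Tail + 1) % SKETCH_LOG_CAPACITY;
    if (Log->Used == SKETCH_LOG_CAPACITY) {
      Log->Head = (Log->Head + 1) % SKETCH_LOG_CAPACITY;
    } else {
      Log->Used++;
    }
  }
  SketchUnlock(Log);
}

/* Copy the retained bytes, oldest first; returns the number copied. */
static uint32_t SketchLogRead(SKETCH_LOG *Log, char *Out, uint32_t OutSize) {
  SketchLock(Log);
  uint32_t N = (Log->Used < OutSize) ? Log->Used : OutSize;
  for (uint32_t i = 0; i < N; i++) {
    Out[i] = Log->Data[(Log->Head + i) % SKETCH_LOG_CAPACITY];
  }
  SketchUnlock(Log);
  return N;
}
```

Holding the lock for the whole message keeps concurrent writers from interleaving bytes, at the cost of making other vcpus spin for the duration of the copy.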

The feature (i.e. the new libraries and PEIM) is disabled by default and is enabled via the new "DEBUG_TO_MEM" build flag (which can be set on the build command line, similar to DEBUG_ON_SERIAL_PORT).

Notes:

The debug log memory buffer size can be configured via FwCfg and is circular - i.e. only the most recent debug messages are retained if the memory buffer overflows. This is appropriate as typically only the most recent debug messages are relevant when debugging an issue. The code currently defaults to a 32 page (128K) sized memory debug log buffer (the default is configured via a PCD).

A host-side tool/utility can be easily implemented to search the VM's QEMU memory regions to locate the OVMF memory debug log buffer (located on a page boundary), decipher the buffer header and display the circular buffer contents (debug messages). We (Oracle) already have such a utility which can extract the OVMF memory debug log from a live QEMU process or a QEMU core file.
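
A host-side extractor along these lines might look roughly like the sketch below. It is not the Oracle utility: the 8-byte signature, the `SKETCH_HEADER` layout and the 4 KiB page size are assumptions made purely for illustration, and the "image" is just a byte array standing in for guest memory read from a QEMU process or core file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical signature; the PR's real magic value is not shown here. */
static const uint8_t kSketchMagic[8] = { 'M', 'E', 'M', 'D', 'B', 'G', 'L', 'G' };

/* Hypothetical header preceding the circular data area. */
typedef struct {
  uint8_t  Magic[8];
  uint32_t HeaderSize;  /* bytes from header start to first data byte */
  uint32_t DataSize;    /* capacity of the circular data area         */
  uint32_t Head;        /* offset of the oldest byte in the data area */
  uint32_t Used;        /* number of valid bytes                      */
} SKETCH_HEADER;

/* Scan page boundaries only; returns the header offset or (size_t)-1. */
static size_t SketchFindLog(const uint8_t *Image, size_t ImageSize) {
  for (size_t Off = 0; Off + sizeof(SKETCH_HEADER) <= ImageSize;
       Off += SKETCH_PAGE_SIZE) {
    if (memcmp(Image + Off, kSketchMagic, sizeof kSketchMagic) == 0) {
      return Off;
    }
  }
  return (size_t)-1;
}

/* Unwrap the circular buffer oldest-first; returns bytes copied (0 if the
   header looks malformed). Off must come from SketchFindLog(). */
static size_t SketchExtract(const uint8_t *Image, size_t ImageSize, size_t Off,
                            char *Out, size_t OutSize) {
  SKETCH_HEADER Hdr;
  assert(Off + sizeof Hdr <= ImageSize);
  memcpy(&Hdr, Image + Off, sizeof Hdr);  /* copy out: no unaligned reads */
  if (Hdr.HeaderSize < sizeof Hdr || Hdr.DataSize == 0 ||
      Hdr.Used > Hdr.DataSize || Hdr.Head >= Hdr.DataSize) {
    return 0;
  }
  if (Hdr.HeaderSize > ImageSize - Off ||
      Hdr.DataSize > ImageSize - Off - Hdr.HeaderSize) {
    return 0;
  }
  const uint8_t *Data = Image + Off + Hdr.HeaderSize;
  size_t N = (Hdr.Used < OutSize) ? Hdr.Used : OutSize;
  for (size_t i = 0; i < N; i++) {
    Out[i] = (char)Data[(Hdr.Head + i) % Hdr.DataSize];
  }
  return N;
}
```

Restricting the search to page boundaries keeps the scan cheap and avoids false hits on copies of the signature elsewhere in memory, such as in the firmware image itself.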

This feature doesn't work with Confidential Computing VMs as the guest memory is encrypted and thus not readable from the host. A future enhancement could possibly be made to mark the OVMF memory debug log buffer as unencrypted (?) TBD

This PR only covers OVMF (x86_64) but was written to be easily extended to AAVMF (Arm) in a future PR. TBD

  • Breaking change?
    • Breaking change - Does this PR cause a break in build or boot behavior?
    • Examples: Does it add a new library class or move a module to a different repo.
  • Impacts security?
    • Security - Does this PR have a direct security impact?
    • Examples: Crypto algorithm change or buffer overflow fix.
  • Includes tests?
    • Tests - Does this PR include any explicit test code?
    • Examples: Unit tests or integration tests.

How This Was Tested

Both OvmfPkgIa32X64.dsc and OvmfPkgX64.dsc were built and tested, with and without -D DEBUG_TO_MEM. The OVMF Memory Debug Log was extracted from the VM's QEMU process (by a custom utility) and assessed for correctness.

Integration Instructions

N/A

The OVMF Memory Debug Logging feature captures DEBUG() messages
into a memory buffer allowing for extraction of debug messages
directly from a qemu process or core file.

Add the GUIDs and PCDs definitions required for the
OVMF Memory Debug Logging feature.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Add the Memory Debug Logging feature MemDebugLogCommonLib library
which provides core functions to maintain the circular memory debug
log buffer.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Add the Memory Debug Logging feature PEI Module which
is responsible for allocating and initializing the memory
log buffer and providing the PPI to write to the buffer.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Add the Memory Debug Logging feature MemDebugLogLib library
which provides the key MemDebugLogWrite() function.

Several versions (i.e. SEC, PEIM, DXE, runtime) of
the library are included to provide the proper
method to write the debug messages to the memory
debug log buffer.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Add the new DebugLib class PlatformDebugLibMem library
to write debug messages to memory.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Reserve the early memory debug log buffer.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

Add the OVMF Memory Debug Logging feature to the
Ia32X64 and X64 OVMF builds.

The feature is enabled via the DEBUG_TO_MEM build flag.

Cc: Gerd Hoffmann <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Jiewen Yao <[email protected]>
Signed-off-by: Aaron Young <[email protected]>

@github-actions github-actions bot added the impact:breaking-change This change breaks existing APIs impacting build or boot. label Apr 23, 2025
@kraxel
Member

kraxel commented Apr 24, 2025

Any specific reason why this is split into three different libraries? I don't think this is needed. You can put all this into one directory and create three slightly different versions of the DebugLib library for SEC, PEI and DXE. Look at OvmfPkg/Library/QemuFwCfgLib for example, which has QemuFwCfg{Sec,Pei,Dxe}Lib.inf. Some code is in sec/pei/dxe-specific source files, but most code is in shared source files which are linked into all library variants.

I think it makes sense to install an EFI configuration table describing the log buffer, so the operating system can find the log buffer too. The linux kernel could easily export it via sysfs then.

I think it makes sense to support logging to both debugcon and memory. OVMF detects whether a debugcon device is present or not; most of the logging overhead goes away in case the device is not present. Having debugcon enabled in builds makes sense IMHO. Having memory logging clearly is useful too.

Do you have pointers to the log extraction utilities mentioned?

//
// Structure version- MUST be third field and set to 1
//
UINT8 Version;
Member

I'd suggest to use a headersize field instead. When extending the structure just append fields to the header. Existing old software will be able to parse the parts of the structure they know about, providing better backward compatibility.

I'd also suggest to make sure all struct fields are aligned. Unaligned field access is a PITA on !x86 architectures.
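
To make the suggestion concrete, here is a small sketch of how a HeaderSize field lets an older parser cope with a newer, longer header. The two struct layouts, the field names and the appended BootCount field are hypothetical and are not taken from the PR:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout known to an "old" parser. Every field is naturally aligned. */
typedef struct {
  uint32_t Signature;
  uint32_t HeaderSize;  /* total header bytes, including later additions */
  uint32_t BufferSize;
  uint32_t Reserved;    /* padding so the 64-bit field is 8-byte aligned */
  uint64_t Lock;
} SKETCH_HEADER_V1;

/* A later revision only appends fields; existing offsets never move. */
typedef struct {
  uint32_t Signature;
  uint32_t HeaderSize;
  uint32_t BufferSize;
  uint32_t Reserved;
  uint64_t Lock;
  uint64_t BootCount;   /* hypothetical new field */
} SKETCH_HEADER_V2;

static_assert(offsetof(SKETCH_HEADER_V1, Lock) % 8 == 0, "Lock is aligned");
static_assert(offsetof(SKETCH_HEADER_V2, BootCount) % 8 == 0, "BootCount is aligned");
static_assert(offsetof(SKETCH_HEADER_V1, Lock) == offsetof(SKETCH_HEADER_V2, Lock),
              "shared fields keep their offsets");

/* Old parser: read the fields it knows, then skip HeaderSize bytes to reach
   the log data, whatever the producer appended. Returns NULL if malformed. */
static const uint8_t *SketchDataStart(const uint8_t *Blob, size_t BlobSize,
                                      SKETCH_HEADER_V1 *Known) {
  if (BlobSize < sizeof *Known) {
    return NULL;
  }
  memcpy(Known, Blob, sizeof *Known);
  if (Known->HeaderSize < sizeof *Known || Known->HeaderSize > BlobSize) {
    return NULL;
  }
  return Blob + Known->HeaderSize;
}
```

With a plain version number the old parser would have to reject anything newer than what it knows; with HeaderSize it can ignore the unknown tail and still find the data.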

Contributor Author

OK. Will do. Thanks

//
// Protect the log from MP access
//
volatile UINT64 MemDebugLogLock;
Member

Needs a more verbose comment how this works.

Contributor Author

Will do. Thanks

//
// Firmware Build Version (PcdFirmwareVersionString)
//
CHAR8 FirmwareVersion[32];
Member

This is rather short, I'd suggest at least 64 bytes. Especially test builds (with bug identifier and timestamp included in the version string) are easily longer than that.

Contributor Author

Sure. Will do

@ajyoung-oracle
Contributor Author

Any specific reason why this is split into three different libraries? I don't think this is needed. You can put all this into one directory and create three slightly different versions of the DebugLib library for SEC, PEI and DXE. Look at OvmfPkg/Library/QemuFwCfgLib for example, which has QemuFwCfg{Sec,Pei,Dxe}Lib.inf. Some code is in sec/pei/dxe-specific source files, but most code is in shared source files which are linked into all library variants.

I think it makes sense to support logging to both debugcon and memory. OVMF detects whether a debugcon device is present or not; most of the logging overhead goes away in case the device is not present. Having debugcon enabled in builds makes sense IMHO. Having memory logging clearly is useful too.

Thanks for the review. Yes, I can combine the Libraries into one. Will do.

I also like the idea to make it possible to write debug messages to both debugcon and memory simultaneously. How about serial too? Should I create a new OvmfPkg specific DebugLib that can write to any combination of debugcon, mem and serial? With debugcon as the default and mem and serial optionally selectable (via -D DEBUG_TO_MEM and -D DEBUG_ON_SERIAL_PORT respectively)? Seems that should be easy to do. What do you think?

@ajyoung-oracle
Contributor Author

I think it makes sense to install an EFI configuration table describing the log buffer, so the operating system can find the log buffer too. The linux kernel could easily export it via sysfs then.

OK. I think I'd prefer to address this in a separate subsequent PR if that's OK?

@kraxel
Member

kraxel commented Apr 25, 2025

I also like the idea to make it possible to write debug messages to both debugcon and memory simultaneously. How about serial too? Should I create a new OvmfPkg specific DebugLib that can write to any combination of debugcon, mem and serial? With debugcon as the default and mem and serial optionally selectable (via -D DEBUG_TO_MEM and -D DEBUG_ON_SERIAL_PORT respectively)? Seems that should be easy to do. What do you think?

I think the interesting combinations are serial+memory (for non-qemu platforms) and debugcon+memory (qemu platforms). Logging to both serial+debugcon looks pointless to me. debugcon+memory is probably easiest to implement because debugcon is an OvmfPkg library anyway, so we only need to change OvmfPkg for that. The memory logging feature could be implemented as a MemoryLog library, and PlatformDebugLibIoPort could simply call into the library to store a copy of the log message in the memory log buffer too.

Doing the same for serial+memory requires the MemoryLog library API (and Null implementation) being in MdePkg, so the same logic can be added to BaseDebugLibSerialPort without adding an OvmfPkg dependency to MdePkg.
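
The shape of that proposal can be sketched in plain C as below. This is only an illustration: in EDK II the choice between a real and a Null MemoryLog instance would be made at link time through the DSC, whereas this sketch uses function pointers so it stays self-contained. All `Sketch*` names are hypothetical, and the byte counters stand in for real output:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*SKETCH_SINK)(const char *Msg, size_t Len);

/* Counters stand in for the real debugcon port and memory buffer. */
static size_t mDebugconBytes;
static size_t mMemLogBytes;

static void SketchDebugconWrite(const char *Msg, size_t Len) {
  (void)Msg;
  mDebugconBytes += Len;
}

static void SketchMemLogWrite(const char *Msg, size_t Len) {
  (void)Msg;
  mMemLogBytes += Len;
}

/* "Null" instance: same signature, does nothing. */
static void SketchMemLogWriteNull(const char *Msg, size_t Len) {
  (void)Msg;
  (void)Len;
}

typedef struct {
  bool        DebugconPresent;  /* result of probing for the device */
  SKETCH_SINK Debugcon;
  SKETCH_SINK MemLog;           /* real or Null instance, never NULL */
} SKETCH_DEBUG_LIB;

/* Debug output path: debugcon only if detected, memory log always. */
static void SketchDebugPrint(const SKETCH_DEBUG_LIB *Lib, const char *Msg) {
  assert(Lib != NULL && Lib->MemLog != NULL);
  size_t Len = strlen(Msg);
  if (Lib->DebugconPresent && Lib->Debugcon != NULL) {
    Lib->Debugcon(Msg, Len);
  }
  Lib->MemLog(Msg, Len);
}
```

Because the Null instance has the same signature as the real one, the calling code needs no conditional compilation; platforms without memory logging simply pick the Null instance.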

@kraxel
Member

kraxel commented Apr 25, 2025

I think it makes sense to install an EFI configuration table describing the log buffer, so the operating system can find the log buffer too. The linux kernel could easily export it via sysfs then.

OK. I think I'd prefer to address this in a separate subsequent PR if that's OK?

Fine with me.
