Update docs for GPU support with KVM #526

32 changes: 32 additions & 0 deletions source/adminguide/hosts.rst
@@ -223,6 +223,38 @@ Following hypervisor-specific documentations can be referred for different maxim
Guest Instance limit check is not done while deploying an Instance on a KVM hypervisor host.


.. _discovering-gpu-devices-on-hosts:

Discovering GPU Devices on Hosts
--------------------------------

For KVM, the user needs to ensure that IOMMU is enabled and the necessary
drivers are installed. If vGPU is to be used, the user also needs to ensure
that the vGPU type is supported by the host and has been created on the host.
For more information on how to prepare the host for GPU passthrough, see
`Managing GPU devices in virtual machines <https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/configuring_and_managing_virtualization/assembly_managing-gpu-devices-in-virtual-machines_configuring-and-managing-virtualization>`_.
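
For example, a quick way to verify on a KVM host that the IOMMU is active is
shown below; these commands are illustrative, and the exact messages vary by
hardware, distribution and kernel.

.. parsed-literal::

    # Check that the kernel reports an active IOMMU (Intel VT-d or AMD-Vi)
    sudo dmesg | grep -i -e DMAR -e IOMMU

    # A populated iommu_groups directory also indicates that the IOMMU is enabled
    ls /sys/kernel/iommu_groups/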

Once the host is configured with the GPU devices, the operator can trigger
discovery of the GPU devices on the host by running the ``discoverGPUdevices``
command with cmk, or by using the ``Discover GPU devices`` button on the host
details page in the UI. This sends a request to the CloudStack agent to discover
the GPU devices on the host.
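
As a sketch, the same discovery might be triggered from cmk as shown below. The
exact command form and the host ID parameter name are assumptions here; check
the ``cmk help`` output for ``discoverGPUdevices`` on your version.

.. parsed-literal::

    # Hypothetical invocation; <host-uuid> is the UUID of the KVM host
    cmk discoverGPUdevices id=<host-uuid>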

The CloudStack agent uses the ``gpudiscovery.sh`` script to discover the GPU
devices on a KVM host. The script is located in the
``/usr/share/cloudstack-common/scripts/vm/`` directory on the host.

.. note::
The script can be run manually to debug the discovery of the GPU devices on a host.

.. parsed-literal::

sudo /usr/share/cloudstack-common/scripts/vm/gpudiscovery.sh

The script outputs the GPU devices found on the host as JSON. The operator
can also modify the script to customize how GPU devices are discovered on the host.
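
When debugging, the JSON output can be pretty-printed for easier reading, for
example by piping it through Python's built-in JSON formatter:

.. parsed-literal::

    sudo /usr/share/cloudstack-common/scripts/vm/gpudiscovery.sh | python3 -m json.tool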


Changing Host Password
----------------------
22 changes: 11 additions & 11 deletions source/adminguide/service_offerings.rst
@@ -289,22 +289,22 @@ To create a new compute offering:
- Preferred: The instance will be deployed in dedicated infrastructure if
possible. Otherwise, the instance can be deployed in shared infrastructure.

- **GPU**: Assign a physical GPU(GPU-passthrough) or a portion of a physical
- **GPU Card**: Assign a physical GPU (GPU-passthrough) or a portion of a physical
GPU card (vGPU) to the guest instance. It allows graphical applications to run on the instance.
Select the card from the supported list of cards.
The options given are NVIDIA GRID K1 and NVIDIA GRID K2. These are vGPU
capable cards that allow multiple vGPUs on a single physical GPU. If you
want to use a card other than these, follow the instructions in the
**"GPU and vGPU support for CloudStack Guest instances"** page in the
Cloudstack Version 4.4 Design Docs found in the Cloudstack Wiki.

- **vGPU Type**: Represents the type of virtual GPU to be assigned to a
- **GPU Profile**: Represents the type of virtual GPU to be assigned to a
guest instance. In this case, only a portion of a physical GPU card (vGPU) is
assigned to the guest instance.
Additionally, the **passthrough vGPU** type is defined to represent a physical GPU
device. A **passthrough vGPU** can directly be assigned to a single guest instance.
In this case, a physical GPU device is exclusively allotted to a single
guest instance.
Additionally, the **passthrough** type is defined to represent a physical GPU
device. A **passthrough** device can be assigned directly to a single guest instance.
In this case, the physical GPU device is exclusively allotted to that guest instance.

- **GPU Count**: The number of GPUs to be assigned to the guest instance.
This is applicable only for KVM hypervisor.

- **GPU Display**: Whether to use the GPU device attached to the guest instance for display.
This is applicable only for KVM hypervisor.

- **Public**: Indicate whether the compute offering should be
available to all domains or only some domains. Choose Yes to make it
17 changes: 13 additions & 4 deletions source/adminguide/usage.rst
@@ -249,20 +249,29 @@ max.account.cpus Maximum number of CPU cores that can be used
Default is 40.
max.account.ram (MB) Maximum RAM that can be used for an Account.
Default is 40960.
max.account.gpus Maximum number of GPUs that can be used for an Account.
Default is 20.
max.account.primary.storage (GB) Maximum primary storage space that can be used for an Account.
Default is 200.
max.account.secondary.storage (GB) Maximum secondary storage space that can be used for an Account.
Default is 400.
max.project.cpus Maximum number of CPU cores that can be used for an Account.
max.project.cpus Maximum number of CPU cores that can be used for a Project.
Default is 40.
max.project.ram (MB) Maximum RAM that can be used for an Account.
max.project.ram (MB) Maximum RAM that can be used for a Project.
Default is 40960.
max.project.primary.storage (GB) Maximum primary storage space that can be used for an Account.
max.project.gpus Maximum number of GPUs that can be used for a Project.
Default is 20.
max.project.primary.storage (GB) Maximum primary storage space that can be used for a Project.
Default is 200.
max.project.secondary.storage (GB) Maximum secondary storage space that can be used for an Account.
max.project.secondary.storage (GB) Maximum secondary storage space that can be used for a Project.
Default is 400.
=================================== =================================================================

The GPU devices are not detached when the Instance is stopped. Therefore,
the GPU devices for stopped Instances are counted towards the resource limits.
To avoid this, the administrator can set the ``gpu.detach.on.stop`` global
setting to ``true`` to detach the GPU devices when the Instance is stopped.
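
For example, assuming administrator access, the setting can be changed with cmk
(or from the Global Settings section of the UI); the cmk command form below is a
sketch and may need adjusting for your version:

.. parsed-literal::

    cmk update configuration name=gpu.detach.on.stop value=true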

The administrator can also set limits for specific tagged host and storage
resources for an account or domain. Such tags must be specified in the following
global settings:
123 changes: 81 additions & 42 deletions source/adminguide/virtual_machines.rst
@@ -1593,39 +1593,54 @@ CloudStack meet the intensive graphical processing requirement by means of the
high computation power of GPU/vGPU, and CloudStack users can run multimedia
rich applications, such as Auto-CAD, that they otherwise enjoy at their desk on
a virtualized environment.
CloudStack leverages the XenServer support for NVIDIA GRID Kepler 1 and 2 series
to run GPU/vGPU enabled Instances. NVIDIA GRID cards allows sharing a single GPU cards
among multiple Instances by creating vGPUs for each Instance. With vGPU technology, the
graphics commands from each Instance are passed directly to the underlying dedicated
GPU, without the intervention of the hypervisor. This allows the GPU hardware
to be time-sliced and shared across multiple Instances. XenServer hosts use the GPU
cards in following ways:

**GPU passthrough**: GPU passthrough represents a physical GPU which can be

For KVM, CloudStack leverages libvirt's PCI passthrough feature to assign a
physical GPU to a guest Instance. For vGPU profiles, depending on the vGPU type,
CloudStack uses mediated devices or Virtual Functions (VFs) to assign a virtual
GPU to a guest Instance. It is the responsibility of the operator to ensure that
GPU devices are in the correct state and available for use on the host. If the
operator wants to use vGPU profiles, they need to ensure that the vGPU type is
supported by the host and has been created on the host.
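
As an illustrative check on a KVM host, ``lspci`` shows which kernel driver each
GPU is currently bound to (for passthrough the device is typically bound to
``vfio-pci``, while for NVIDIA vGPU it is bound to the NVIDIA host driver):

.. parsed-literal::

    # List NVIDIA devices (PCI vendor ID 10de) together with the kernel driver in use
    lspci -nnk -d 10de: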

For XenServer, CloudStack leverages the XenServer support for NVIDIA GRID
Kepler 1 and 2 series to run GPU/vGPU enabled Instances.

Some NVIDIA cards allow sharing a single GPU card among multiple Instances by
creating vGPUs for each Instance. With vGPU technology, the graphics commands
from each Instance are passed directly to the underlying dedicated GPU, without
the intervention of the hypervisor. This allows the GPU hardware to be
time-sliced and shared across multiple Instances. The GPU cards are used in the
following ways:

**passthrough**: GPU passthrough represents a physical GPU which can be
directly assigned to an Instance. GPU passthrough can be used on a hypervisor alongside
GRID vGPU, with some restrictions: A GRID physical GPU can either host GRID
vGPUs or be used as passthrough, but not both at the same time.

**GRID vGPU**: GRID vGPU enables multiple Instances to share a single physical GPU.
**vGPU**: vGPU enables multiple Instances to share a single physical GPU.
The Instances run an NVIDIA driver stack and get direct access to the GPU. GRID
physical GPUs are capable of supporting multiple virtual GPU devices (vGPUs)
that can be assigned directly to guest Instances. Guest Instances use GRID virtual GPUs in
that can be assigned directly to guest Instances. Guest Instances use vGPUs in
the same manner as a physical GPU that has been passed through by the
hypervisor: an NVIDIA driver loaded in the guest Instance provides direct access to
the GPU for performance-critical fast paths, and a paravirtualized interface to
the GRID Virtual GPU Manager, which is used for nonperformant management
operations. NVIDIA GRID Virtual GPU Manager for XenServer runs in dom0.
the NVIDIA vGPU Manager, which is used for non-performance-critical management
operations. NVIDIA vGPU Manager for XenServer runs in dom0.

CloudStack provides you with the following capabilities:

- Adding XenServer hosts with GPU/vGPU capability provisioned by the administrator.
- Adding hosts with GPU/vGPU capability provisioned by the administrator
(supported only on XenServer and KVM).

- Creating a Compute Offering with GPU/vGPU capability.
- Creating a Compute Offering with GPU/vGPU capability. For KVM, it is possible to
specify the GPU count and whether to use the GPU for display. For XenServer, the
GPU count is ignored and only one device is assigned to the guest Instance.

- Deploying an Instance with GPU/vGPU capability.

- Destroying an Instance with GPU/vGPU capability.

- Allowing an user to add GPU/vGPU support to an Instance without GPU/vGPU support by
- Allowing a user to add GPU/vGPU support to an Instance without GPU/vGPU support by
changing the Service Offering and vice-versa.

- Migrating Instances (cold migration) with GPU/vGPU capability.
@@ -1635,57 +1650,78 @@ CloudStack provides you with the following capabilities:
- Querying hosts to obtain information about the GPU cards, supported vGPU types
in case of GRID cards, and capacity of the cards.

- Limiting the number of GPUs that an account/domain/project can use.

Prerequisites and System Requirements
-------------------------------------

Before proceeding, ensure that you have these prerequisites:

- The vGPU-enabled XenServer 6.2 and later versions.
For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.
- CloudStack does not restrict the deployment of GPU-enabled Instances with
guest OS types that are not supported for GPU/vGPU functionality. The deployment
will succeed and a GPU/vGPU will also be allocated for such Instances; however,
due to missing guest OS drivers, the Instance will not be able to leverage the
GPU resources. Therefore, it is recommended to use GPU-enabled service offerings
only with supported guest operating systems.

- NVIDIA GRID K1 (16 GiB video RAM) and K2 (8 GiB video RAM) cards support
homogeneous virtual GPUs, which implies that at any given time, the vGPUs resident
on a single physical GPU must all be of the same type. However, this restriction
doesn't extend across physical GPUs on the same card. Each physical GPU on a
K1 or K2 may host different types of virtual GPU at the same time. For example,
a GRID K2 card has two physical GPUs, and supports four types of virtual GPU:
GRID K200, GRID K220Q, GRID K240Q, and GRID K260Q.

- The NVIDIA driver must be installed to enable vGPU operation, as for a physical NVIDIA GPU.

- GPU/vGPU functionality is supported for following HVM guest operating systems:
For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

- Windows 7 (x86 and x64)
For XenServer:

- Windows Server 2008 R2
- the vGPU-enabled XenServer 6.2 and later versions.
For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

- Windows Server 2012
- GPU/vGPU functionality is supported for the following HVM guest operating systems:
For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

- Windows 8 (x86 and x64)
- Windows 7 (x86 and x64)

- Windows 8.1 ("Blue") (x86 and x64)
- Windows Server 2008 R2

- Windows Server 2012 R2 (server equivalent of "Blue")
- Windows Server 2012

- CloudStack does not restrict the deployment of GPU-enabled Instances with guest OS types that are not supported by XenServer for GPU/vGPU functionality. The deployment would be successful and a GPU/vGPU will also get allocated for Instances; however, due to missing guest OS drivers, Instance would not be able to leverage GPU resources. Therefore, it is recommended to use GPU-enabled service offering only with supported guest OS.
- Windows 8 (x86 and x64)

- NVIDIA GRID K1 (16 GiB video RAM) AND K2 (8 GiB of video RAM) cards supports homogeneous virtual GPUs, implies that at any given time, the vGPUs resident on a single physical GPU must be all of the same type. However, this restriction doesn't extend across physical GPUs on the same card. Each physical GPU on a K1 or K2 may host different types of virtual GPU at the same time. For example, a GRID K2 card has two physical GPUs, and supports four types of virtual GPU; GRID K200, GRID K220Q, GRID K240Q, AND GRID K260Q.
- Windows 8.1 ("Blue") (x86 and x64)

- NVIDIA driver must be installed to enable vGPU operation as for a physical NVIDIA GPU.
- Windows Server 2012 R2 (server equivalent of "Blue")

- XenServer tools are installed in the Instance to get maximum performance on XenServer, regardless of type of vGPU you are using. Without the optimized networking and storage drivers that the XenServer tools provide, remote graphics applications running on GRID vGPU will not deliver maximum performance.
- XenServer tools are installed in the Instance to get maximum performance on
XenServer, regardless of type of vGPU you are using. Without the optimized
networking and storage drivers that the XenServer tools provide, remote
graphics applications running on GRID vGPU will not deliver maximum performance.

- To deliver high frames from multiple heads on vGPU, install XenDesktop with HDX 3D Pro remote graphics.
- To deliver high frame rates from multiple heads on vGPU, install XenDesktop with
HDX 3D Pro remote graphics.

Before continuing with configuration, consider the following:

- Deploying Instances GPU/vGPU capability is not supported if hosts are not available with enough GPU capacity.

- A Service Offering cannot be created with the GPU values that are not supported by CloudStack UI. However, you can make an API call to achieve this.
- Deploying Instances with GPU/vGPU capability is not supported if no hosts with
enough GPU capacity are available.

- Dynamic scaling is not supported. However, you can choose to deploy an Instance without GPU support, and at a later point, you can change the system offering to upgrade to the one with vGPU. You can achieve this by offline upgrade: stop the Instance, upgrade the Service Offering to the one with vGPU, then start the Instance.
- Dynamic scaling is not supported. However, you can choose to deploy an
Instance without GPU support and, at a later point, change the Service Offering
to one with vGPU. You can achieve this with an offline upgrade: stop the
Instance, change the Service Offering to one with vGPU, then start the
Instance (see the sketch after this list).

- Live migration of GPU/vGPU enabled Instance is not supported.

- Limiting GPU resources per Account/Domain is not supported.

- Disabling GPU at Cluster level is not supported.

- Notification thresholds for GPU resource is not supported.
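
As a rough sketch of the offline upgrade described above, using cmk (the
identifiers are placeholders and the exact command names should be verified
with ``cmk help``):

.. parsed-literal::

    # Stop the Instance, switch it to a GPU-enabled offering, then start it again
    cmk stop virtualmachine id=<instance-uuid>
    cmk change serviceforvirtualmachine id=<instance-uuid> serviceofferingid=<gpu-offering-uuid>
    cmk start virtualmachine id=<instance-uuid>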

Supported GPU Devices
---------------------

Supported GPU Devices for XenServer
-----------------------------------

.. cssclass:: table-striped table-bordered table-hover

@@ -1710,14 +1746,17 @@ GPU/vGPU Assignment Workflow

CloudStack follows the sequence of operations below to provide GPU/vGPU support for Instances:

#. Ensure that XenServer host is ready with GPU installed and configured.
For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.
#. Ensure that the host is ready with a GPU installed and configured.

- For XenServer, see the `XenServer Documentation <https://docs.xenserver.com/en-us/citrix-hypervisor/graphics/hv-graphics-config>`_ for more information.

- For KVM, see `Discovering GPU Devices on Hosts <hosts.html#discovering-gpu-devices-on-hosts>`_ to configure the host.

#. Add the host to CloudStack.
CloudStack queries the host and detects whether it is GPU-enabled.

#. Create a compute offering with GPU/vGPU support:
For more information, see `Creating a New Compute Offering <#creating-a-new-compute-offering>`__..
For more information, see `Creating a New Compute Offering <service_offerings.html#creating-a-new-compute-offering>`_.

#. Continue with any of the following operations:
