Configuring GPU virtualization

Before configuring GPU virtualization, you need to check whether your NVIDIA graphics card supports SR-IOV. The SR-IOV technology enables splitting a single physical device (physical function) into several virtual devices (virtual functions).

  • Legacy GPUs are based on the NVIDIA Tesla architecture and have no SR-IOV support. For such GPUs, virtualization is performed by creating a mediated device (mdev) over the physical function.
  • Modern GPUs are based on the NVIDIA Ampere architecture or newer and support SR-IOV. For such GPUs, virtualization is performed by creating an mdev over a virtual function.

For more details, refer to the official NVIDIA documentation.
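
In both cases, the mediated device is exposed through the standard Linux VFIO mdev interface in sysfs. The commands below only illustrate that mechanism: the PCI address, vGPU type, and UUID are placeholders, and in this product the mediated devices are created for you when a virtual machine requests a vGPU, so you do not normally run these commands yourself.

  # TYPE_DIR=/sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/nvidia-224   # for SR-IOV GPUs, the path goes through a virtual function (.../virtfn0/mdev_supported_types/...)
  # UUID=$(uuidgen)
  # echo "$UUID" > "$TYPE_DIR/create"              # create a mediated device of this type
  # ls /sys/bus/mdev/devices/                      # the new mdev appears here
  # echo 1 > "/sys/bus/mdev/devices/$UUID/remove"  # remove it again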

Limitations

  • Virtual machines with attached vGPUs cannot be suspended or live migrated.

Prerequisites

Procedure overview

  1. Prepare a compute node for GPU virtualization.
  2. Reconfigure the compute cluster to enable vGPU support.
  3. Change the vGPU type for a physical GPU.
  4. Check that the node has vGPU resources for allocation.
  5. Create a virtual machine with an attached vGPU.
  6. Verify the attached GPU in the virtual machine.

To enable vGPU on a node

  1. List all graphics cards on the node and obtain their PCI addresses:

    # lspci -D | grep NVIDIA
    0000:01:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
    0000:81:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)

    In the command output, 0000:01:00.0 and 0000:81:00.0 are the PCI addresses of the graphics cards.

  2. On the node with the physical GPU, do one of the following:

  3. Install the vGPU NVIDIA driver:

    1. Install the kernel-devel and dkms packages:

      # dnf install kernel-devel dkms
      
    2. Enable and start the dkms service:

      # systemctl enable dkms.service 
      # systemctl start dkms.service
    3. Install the vGPU KVM kernel module from the NVIDIA GRID package with the --dkms option:

      # bash NVIDIA-Linux-x86_64-xxx.xx.xx-vgpu-kvm*.run --dkms
    4. Re-create the Linux boot image by running:

      # dracut -f
  4. Enable IOMMU on the node by running the pci-helper.py enable-iommu script and reboot the node to apply the changes:

    # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py enable-iommu
    # reboot

    The script works for both Intel and AMD processors.

  5. Verify that IOMMU is enabled in the dmesg output:

    # dmesg | grep -e DMAR -e IOMMU
    [    0.000000] DMAR: IOMMU enabled
  6. [For modern GPUs with SR-IOV support] Enable the virtual functions for your GPU:

    # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py nvidia-sriov-mgr --enable
  7. Verify that vGPU is enabled on the node:

    • [For legacy GPUs without SR-IOV support] Check the /sys/bus/pci/devices/<pci_address>/mdev_supported_types directory. For example, for the GPU with the PCI address 0000:01:00.0, run:

      # ls /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types
      nvidia-222  nvidia-223  nvidia-224  nvidia-225  nvidia-226  nvidia-227  nvidia-228  nvidia-229  nvidia-230  nvidia-231
      nvidia-232  nvidia-233  nvidia-234  nvidia-252  nvidia-319  nvidia-320  nvidia-321

      For a vGPU-enabled card, the directory contains a list of supported vGPU types. A vGPU type is a vGPU configuration that defines the vRAM size, maximum resolution, maximum number of supported vGPUs, and other parameters. A quick way to read these details for a specific type is sketched after this procedure.

    • [For modern GPUs with SR-IOV support] Check supported vGPU types and the number of available instances per vGPU type. For example, for the GPU with the PCI address 0000:c1:00.0, run:

      # cd /sys/bus/pci/devices/0000:c1:00.0/virtfn0/mdev_supported_types
      # grep -vR --include=available_instances 0
      nvidia-568/available_instances:1
      nvidia-558/available_instances:1
      nvidia-556/available_instances:1

      In the command output, the supported types are nvidia-568, nvidia-558, and nvidia-556, and each virtual function can host one instance.
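
As an optional extra check after completing this procedure, you can read the human-readable details of a vGPU type directly from sysfs and look at the host driver's view of the card. The PCI address 0000:01:00.0 and the type nvidia-224 below come from the example output above, and the nvidia-smi vgpu subcommand becomes available once the vGPU host driver from step 3 is installed:

  # cat /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/nvidia-224/name          # human-readable vGPU type name
  # cat /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/nvidia-224/description   # vRAM size, resolution, and instance limits
  # nvidia-smi vgpu                                                                     # host-side list of vGPU-capable GPUs and running vGPUs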

To enable vGPU support for the compute cluster

  1. Create a configuration file in the YAML format. For example:

    # cat << EOF > pci-passthrough.yaml
    - node_id: 1d6481c2-1fd5-406b-a0c7-330f24bd0e3d
      devices:
        - device_type: pgpu
          device: "0000:01:00.0"
          vgpu_type: nvidia-224
        - device_type: pgpu
          device: "0000:81:00.0"
          vgpu_type: nvidia-232
    EOF

    In this example:

    • node_id is the UUID of the compute node that hosts a physical GPU
    • pgpu is the device type for a physical GPU that will be virtualized
    • "0000:01:00.0" and "0000:81:00.0" are the PCI addresses of the physical GPUs
    • nvidia-224 and nvidia-232 are the vGPU types that will be enabled for the physical GPUs

    If a compute node has multiple graphics cards, it can be configured for both GPU passthrough and virtualization. Because the file is a YAML list, it can also describe several nodes at once, as sketched after this procedure.

  2. Reconfigure the compute cluster by using this configuration file:

    # vinfra service compute set --pci-passthrough-config pci-passthrough.yaml
    +---------+--------------------------------------+
    | Field   | Value                                |
    +---------+--------------------------------------+
    | task_id | 89c8a6c4-f480-424e-ab44-c2f4e2976eb9 |
    +---------+--------------------------------------+
  3. Check the status of the task:

    # vinfra task show 89c8a6c4-f480-424e-ab44-c2f4e2976eb9
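
A minimal sketch of a configuration file that covers two compute nodes is shown below. The second node UUID and its PCI address are placeholders; substitute the UUIDs and addresses of your own nodes and GPUs.

  # cat << EOF > pci-passthrough.yaml
  - node_id: 1d6481c2-1fd5-406b-a0c7-330f24bd0e3d
    devices:
      - device_type: pgpu
        device: "0000:01:00.0"
        vgpu_type: nvidia-224
  - node_id: 00000000-0000-4000-8000-000000000002
    devices:
      - device_type: pgpu
        device: "0000:85:00.0"
        vgpu_type: nvidia-232
  EOF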

To change the vGPU type for a physical GPU

  1. Ensure that the compute cluster has no virtual machines with the current vGPU type (one way to check this is sketched after this procedure).
  2. Modify the configuration file. For example, replace nvidia-224 with nvidia-231 in the vgpu_type field:

    - device_type: pgpu
      device: "0000:01:00.0"
      vgpu_type: nvidia-231
  3. Pass the configuration file to the vinfra service compute set command. For example:

    # vinfra service compute set --pci-passthrough-config pci-passthrough.yaml
  4. Reboot the node with the physical GPU to apply changes:

    # reboot
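
As a sketch of the check mentioned in step 1, you can ask the placement service how many vGPUs are currently allocated from the GPU's resource providers; the usage reported for the VGPU resource class should be 0 before you change the type. The resource provider ID is obtained as described in the next procedure, and resource provider usage show is a standard command of the OpenStack placement CLI:

  # openstack --insecure resource provider usage show <resource_provider_id>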

To check that a node has vGPU resources for allocation

  1. List resource providers in the compute cluster to obtain their IDs. For example:

    # openstack --insecure resource provider list
    +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
    | uuid                                 | name                                    | generation | root_provider_uuid                   | parent_provider_uuid                 |
    +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
    | 359cccf7-9c64-4edc-a35d-f4673e485a04 | node001.vstoragedomain_pci_0000_01_00_0 |          1 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
    | b8443d1b-b941-4bf5-ab4b-2dc7c64ac7d1 | node001.vstoragedomain_pci_0000_81_00_0 |          1 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
    | 4936695a-4711-425a-b0e4-fdab5e4688d6 | node001.vstoragedomain                  |        823 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | None                                 |
    +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
    

    In this output, the resource provider with the ID 4936695a-4711-425a-b0e4-fdab5e4688d6 has two child resource providers for the two physical GPUs with PCI addresses 0000:01:00.0 and 0000:81:00.0.

  2. Use the obtained ID of a child resource provider to list its inventory. For example:

    # openstack --insecure resource provider inventory list 359cccf7-9c64-4edc-a35d-f4673e485a04
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | resource_class | allocation_ratio | max_unit | reserved | step_size | min_unit | total |
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | VGPU           |              1.0 |        8 |        0 |         1 |        1 |     8 |
    +----------------+------------------+----------+----------+-----------+----------+-------+

    The child resource provider has vGPU resources that can be allocated to virtual machines.
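
To see how much of this inventory is already consumed, or whether a vGPU can still be scheduled somewhere in the cluster, you can also query the placement service. Both commands below are standard OpenStack placement CLI commands; the UUID is the child resource provider from the example above:

  # openstack --insecure resource provider usage show 359cccf7-9c64-4edc-a35d-f4673e485a04    # current VGPU usage on this GPU
  # openstack --insecure allocation candidate list --resource VGPU=1                          # providers that can still serve one vGPU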

To create a virtual machine with a vGPU

  • If you use only one vGPU type in the compute cluster, you need to create a flavor that requests one virtual GPU, and then create virtual machines with this flavor:

    1. Create a flavor with the resources property specifying the number of vGPUs to use. For example, to create the vgpu-flavor flavor with 2 vCPUs and 4 GiB of RAM, run:

      # openstack --insecure flavor create --ram 4096 --vcpus 2 --property resources:VGPU=1 --public vgpu-flavor
    2. Some drivers may require hiding the hypervisor signature. To do this, add the hide_hypervisor_id property to the flavor:

      # openstack --insecure flavor set vgpu-flavor --property hide_hypervisor_id=true
    3. Create a boot volume from an image (for example, Ubuntu):

      # openstack --insecure volume create --size 20 --image ubuntu vgpu-boot-volume
      
    4. Create a virtual machine specifying vgpu-flavor and vgpu-boot-volume. For example, to create the VM vgpu-vm, run:

      # openstack --insecure server create --flavor vgpu-flavor --volume vgpu-boot-volume --network <network_name> vgpu-vm

    The created virtual machine will have a virtual GPU of the type that is configured in the compute cluster.

  • If you want to use multiple vGPU types in the compute cluster, you need to manually create CUSTOM_NVIDIA_XXX traits, assign them to corresponding vGPU resource providers, and then proceed to create flavors and virtual machines with the assigned traits:

    1. List resource providers in the compute cluster to obtain their IDs. For example:

      # openstack --insecure resource provider list
      +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
      | uuid                                 | name                                    | generation | root_provider_uuid | parent_provider_uuid |
      +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
      | 7d2ef259-42df-4ef8-8eaa-66c3b7448fc3 | node001.vstoragedomain_pci_0000_85_00_0 |         62 | 1f08c319-f270-<…>  | 1f08c319-f270-<…>    |
      | 94a84fc6-2f28-46d5-93e1-e588e347dd3b | node001.vstoragedomain_pci_0000_10_00_0 |         38 | 1f08c319-f270-<…>  | 1f08c319-f270-<…>    |
      | 41c177e3-6998-4e56-8d29-f98f72fef910 | node002.vstoragedomain_pci_0000_85_00_0 |         13 | 9dbc8c64-0048-<…>  | 9dbc8c64-0048-<…>    |
      | 7fd1d10f-9ceb-4cd1-acec-a1254755211b | node002.vstoragedomain_pci_0000_10_00_0 |         13 | 9dbc8c64-0048-<…>  | 9dbc8c64-0048-<…>    |
      +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
    2. Create custom traits that correspond to different GPU types. For example, to create the traits CUSTOM_NVIDIA_231 and CUSTOM_NVIDIA_232, run:

      # openstack --insecure trait create CUSTOM_NVIDIA_231
      # openstack --insecure trait create CUSTOM_NVIDIA_232
    3. Add the corresponding trait to the resource provider matching the GPU. For example:

      # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_231 7d2ef259-42df-4ef8-8eaa-66c3b7448fc3
      +-------------------+
      | name              |
      +-------------------+
      | CUSTOM_NVIDIA_231 |
      +-------------------+
      # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_231 94a84fc6-2f28-46d5-93e1-e588e347dd3b
      +-------------------+
      | name              |
      +-------------------+
      | CUSTOM_NVIDIA_231 |
      +-------------------+

      Now, the trait CUSTOM_NVIDIA_231 is assigned to the vGPU resource providers of the node node001. To assign the trait CUSTOM_NVIDIA_232 to the vGPU resource providers of the node node002, run:

      # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_232 41c177e3-6998-4e56-8d29-f98f72fef910
      +-------------------+
      | name              |
      +-------------------+
      | CUSTOM_NVIDIA_232 |
      +-------------------+
      # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_232 7fd1d10f-9ceb-4cd1-acec-a1254755211b
      +-------------------+
      | name              |
      +-------------------+
      | CUSTOM_NVIDIA_232 |
      +-------------------+
    4. Create flavors with the resources property specifying the number of vGPUs to use. For example, to create the vgpu231-flavor flavor with 2 vCPUs and 4 GiB of RAM and the vgpu232-flavor flavor with 4 vCPUs and 8 GiB of RAM, run:

      # openstack --insecure flavor create --ram 4096 --vcpus 2 --property resources:VGPU=1 --public vgpu231-flavor
      # openstack --insecure flavor create --ram 8192 --vcpus 4 --property resources:VGPU=1 --public vgpu232-flavor
    5. Add the requested traits to your flavors. For example, to add the traits CUSTOM_NVIDIA_231 and CUSTOM_NVIDIA_232 to the flavors vgpu231-flavor and vgpu232-flavor, run:

      # openstack --insecure flavor set --property trait:CUSTOM_NVIDIA_231=required vgpu231-flavor
      # openstack --insecure flavor set --property trait:CUSTOM_NVIDIA_232=required vgpu232-flavor
    6. Create a boot volume from an image (for example, Ubuntu):

      # openstack --insecure volume create --size 20 --image ubuntu vgpu-boot-volume
      
    7. Create virtual machines specifying the prepared flavors and vgpu-boot-volume. For example, to create the VM vgpu231-vm with the vgpu231-flavor flavor and the VM vgpu232-vm with the vgpu232-flavor flavor, run:

      # openstack --insecure server create --volume vgpu-boot-volume --flavor vgpu231-flavor --network <network_name> vgpu231-vm
      # openstack --insecure server create --volume vgpu-boot-volume --flavor vgpu232-flavor --network <network_name> vgpu232-vm

    The created virtual machines will have virtual GPUs of different types that are configured in the compute cluster.
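
If you want to confirm which physical GPU a machine was actually scheduled to, you can inspect its placement allocations. This is only a sketch: vgpu231-vm is the example VM created above, and resource provider allocation show is a standard OpenStack placement CLI command that takes the server UUID as the consumer ID:

  # SERVER_ID=$(openstack --insecure server show vgpu231-vm -f value -c id)
  # openstack --insecure resource provider allocation show $SERVER_ID    # VGPU=1 is listed against the GPU's child resource provider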

To check the GPU in a virtual machine

  1. Log in to the VM via SSH:

    # ssh <username>@<vm_ip_address>
  2. Install the NVIDIA drivers:

    # sudo apt update && sudo apt install -y nvidia-driver-470 nvidia-utils-470
  3. Check the GPU by running:

    # nvidia-smi

    The GPU should be recognized and operational.
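
As a supplementary check, you can also confirm inside the VM that the virtual GPU is visible at the PCI level and reported by the driver. These are generic Linux commands:

  # lspci | grep -i nvidia    # the vGPU appears as an NVIDIA controller
  # nvidia-smi -L             # lists the GPU by name and UUID once the driver is loaded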