Configuring GPU virtualization
Before configuring GPU virtualization, you need to check whether your NVIDIA graphics card supports SR-IOV. The SR-IOV technology enables splitting a single physical device (physical function) into several virtual devices (virtual functions).
- Legacy GPUs are based on the NVIDIA Tesla architecture and have no SR-IOV support. For such GPUs, virtualization is performed by creating a mediated device (mdev) over the physical function.
- Modern GPUs are based on the NVIDIA Ampere architecture or newer and support SR-IOV. For such GPUs, virtualization is performed by creating an mdev over the virtual function.
For more details, refer to the official NVIDIA documentation.
Limitations
- Virtual machines with attached vGPUs cannot be suspended and live migrated.
- The default QXL driver for the VNC console and the NVIDIA GPU driver are incompatible. After installing the NVIDIA GPU driver inside a virtual machine with an attached vGPU, the VNC console stops working. You can use RDP for a remote connection. Alternatively, for templates that already have the NVIDIA GPU driver installed, you can set the hw_use_vgpu_display property to disable the integrated QXL driver. For example:
  # openstack --insecure image set --property hw_use_vgpu_display=true 007db63f-9b41-4918-b572-2c5eef4c8f4b
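  To verify that the property has been applied to the image, you can display its properties (the -c option limits the output to the properties column; the image ID is the one from the example above):
  # openstack --insecure image show -c properties 007db63f-9b41-4918-b572-2c5eef4c8f4b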
Prerequisites
- Download the NVIDIA vGPU software drivers from the NVIDIA Licensing Portal (refer to NVIDIA vGPU Software).
- To authorize further OpenStack commands, the OpenStack command-line client must be configured, as outlined in Connecting to OpenStack command-line interface.
Procedure overview
- Prepare a compute node for GPU virtualization.
- Reconfigure the compute cluster to enable vGPU support.
- Change the vGPU type for a physical GPU.
- Check that the node has vGPU resources for allocation.
- Create a virtual machine with an attached vGPU.
- Verify the attached GPU in the virtual machine.
To enable vGPU on a node
- List all graphics cards on the node and obtain their PCI addresses:
  # lspci -D | grep NVIDIA
  0000:01:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
  0000:81:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
  In the command output, 0000:01:00.0 and 0000:81:00.0 are the PCI addresses of the graphics cards.
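  If you also need the vendor and device IDs of a card (for example, to find its pci-stub.ids entry in GRUB in the next step), you can query them with lspci -nn. The PCI address below is taken from the example output above; expect output similar to the following, with the VID and PID shown in square brackets:
  # lspci -Dnn -s 0000:01:00.0
  0000:01:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)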
- On the node with the physical GPU, do one of the following:
  - If the physical GPU is attached to the node, blacklist the Nouveau driver:
    # rmmod nouveau
    # echo -e "blacklist nouveau\noptions nouveau modeset=0" > /usr/lib/modprobe.d/nouveau.conf
    # echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/nouveau.conf
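    To confirm that the module is no longer loaded, you can check with lsmod; empty output means Nouveau has been unloaded:
    # lsmod | grep nouveau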
  - If the physical GPU is detached from the node:
    - In the /etc/default/grub file, locate the GRUB_CMDLINE_LINUX line, and then delete pci-stub.ids=<gpu_vid>:<gpu_pid>. For example, for a GPU with the VID and PID 10de:1eb8, delete pci-stub.ids=10de:1eb8, and check the resulting file:
      # cat /etc/default/grub | grep CMDLINE
      GRUB_CMDLINE_LINUX="crashkernel=auto tcache.enabled=0 quiet iommu=pt rd.driver.blacklist=nouveau nouveau.modeset=0"
    - Regenerate the GRUB configuration file:
      - On a BIOS-based system, run:
        # /usr/sbin/grub2-mkconfig -o /etc/grub2.cfg --update-bls-cmdline
      - On a UEFI-based system, run:
        # /usr/sbin/grub2-mkconfig -o /etc/grub2-efi.cfg --update-bls-cmdline
    - Reboot the node to apply the changes:
      # reboot
- Install the NVIDIA vGPU driver:
  - Install the kernel-devel and dkms packages:
    # dnf install kernel-devel dkms
  - Enable and start the dkms service:
    # systemctl enable dkms.service
    # systemctl start dkms.service
  - Install the vGPU KVM kernel module from the NVIDIA GRID package with the --dkms option:
    # bash NVIDIA-Linux-x86_64-xxx.xx.xx-vgpu-kvm*.run --dkms
  - Re-create the Linux boot image by running:
    # dracut -f
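  To confirm that the NVIDIA vGPU kernel module was built and registered for the running kernel, you can check the DKMS status; the exact module name and version in the output depend on the driver package you installed:
  # dkms status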
- Enable IOMMU on the node by running the pci-helper.py enable-iommu script, and then reboot the node to apply the changes:
  # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py enable-iommu
  # reboot
  The script works for both Intel and AMD processors.
- Verify that IOMMU is enabled in the dmesg output:
  # dmesg | grep -e DMAR -e IOMMU
  [ 0.000000] DMAR: IOMMU enabled
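  Note that DMAR messages are specific to Intel processors. On AMD processors, look for AMD-Vi messages instead, for example:
  # dmesg | grep -i AMD-Vi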
- [For modern GPUs with SR-IOV support] Enable the virtual functions for your GPU:
  # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py nvidia-sriov-mgr --enable
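  To check that the virtual functions have been created, you can list the virtfn* links in the sysfs directory of the physical GPU. The PCI address 0000:c1:00.0 is used here as an example; substitute the address of your card:
  # ls -d /sys/bus/pci/devices/0000:c1:00.0/virtfn*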
- Verify that vGPU is enabled on the node:
  - [For legacy GPUs without SR-IOV support] Check the /sys/bus/pci/devices/<pci_address>/mdev_supported_types directory. For example, for the GPU with the PCI address 0000:01:00.0, run:
    # ls /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types
    nvidia-222  nvidia-223  nvidia-224  nvidia-225  nvidia-226  nvidia-227  nvidia-228  nvidia-229  nvidia-230
    nvidia-231  nvidia-232  nvidia-233  nvidia-234  nvidia-252  nvidia-319  nvidia-320  nvidia-321
    For a vGPU-enabled card, the directory contains a list of supported vGPU types. A vGPU type is a vGPU configuration that defines the vRAM size, maximum resolution, maximum number of supported vGPUs, and other parameters.
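    Each vGPU type directory also contains standard mediated device attributes, such as name, description, and available_instances, which you can read to map a type ID to its marketing name and capacity. For example, for the nvidia-224 type listed above (the exact output depends on the card and driver version):
    # cat /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/nvidia-224/name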
  - [For modern GPUs with SR-IOV support] Check the supported vGPU types and the number of available instances per vGPU type. For example, for the GPU with the PCI address 0000:c1:00.0, run:
    # cd /sys/bus/pci/devices/0000:c1:00.0/virtfn0/mdev_supported_types
    # grep -vR --include=available_instances 0
    nvidia-568/available_instances:1
    nvidia-558/available_instances:1
    nvidia-556/available_instances:1
    In the command output, the supported types are nvidia-568, nvidia-558, and nvidia-556, and each virtual function can host one instance.
To enable vGPU support for the compute cluster
- Create a configuration file in the YAML format. For example:
# cat << EOF > pci-passthrough.yaml
- node_id: 1d6481c2-1fd5-406b-a0c7-330f24bd0e3d
  devices:
  - device_type: pgpu
    device: "0000:01:00.0"
    vgpu_type: nvidia-224
  - device_type: pgpu
    device: "0000:81:00.0"
    vgpu_type: nvidia-232
EOF
  In this example:
  - node_id is the UUID of the compute node that hosts a physical GPU
  - pgpu is the device type for a physical GPU that will be virtualized
  - "0000:01:00.0" and "0000:81:00.0" are the PCI addresses of the physical GPUs
  - nvidia-224 and nvidia-232 are the vGPU types that will be enabled for the physical GPUs
  If a compute node has multiple graphics cards, it can be configured for both GPU passthrough and virtualization.
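  If you want to sanity-check the file before applying it, a quick parse catches YAML indentation mistakes. This assumes python3 with the PyYAML module is available on the node:
  # python3 -c 'import yaml; print(yaml.safe_load(open("pci-passthrough.yaml")))'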
- Reconfigure the compute cluster by using this configuration file:
  # vinfra service compute set --pci-passthrough-config pci-passthrough.yaml
  +---------+--------------------------------------+
  | Field   | Value                                |
  +---------+--------------------------------------+
  | task_id | 89c8a6c4-f480-424e-ab44-c2f4e2976eb9 |
  +---------+--------------------------------------+
- Check the status of the task:
  # vinfra task show 89c8a6c4-f480-424e-ab44-c2f4e2976eb9
  If the compute configuration fails, check whether the following error appears in /var/log/vstorage-ui-backend/ansible.log:
  2021-09-23 16:42:59,796 p=32130 u=vstoradmin | fatal: [32c8461b-92ec-48c3-ae02-4d12194acd02]: FAILED! => {"changed": true, "cmd": "echo 4 > /sys/class/net/enp103s0f1/device/sriov_numvfs", "delta": "0:00:00.127417", "end": "2021-09-23 19:42:59.784281", "msg": "non-zero return code", "rc": 1, "start": "2021-09-23 19:42:59.656864", "stderr": "/bin/sh: line 0: echo: write error: Cannot allocate memory", "stderr_lines": ["/bin/sh: line 0: echo: write error: Cannot allocate memory"], "stdout": "", "stdout_lines": []}
  In this case, run the pci-helper.py enable-iommu script with the --pci-realloc option, and then reboot the node:
  # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py enable-iommu --pci-realloc
  # reboot
  When the node is up again, repeat the vinfra service compute set command.
To change the vGPU type for a physical GPU
- Ensure that the compute cluster has no virtual machines with the current vGPU type.
- Modify the configuration file. For example, replace nvidia-224 with nvidia-231 in the vgpu_type field:
  - device_type: pgpu
    device: "0000:01:00.0"
    vgpu_type: nvidia-231
- Pass the configuration file to the vinfra service compute set command. For example:
  # vinfra service compute set --pci-passthrough-config config.yaml
- Reboot the node with the physical GPU to apply the changes:
  # reboot
To check that a node has vGPU resources for allocation
- List resource providers in the compute cluster to obtain their IDs. For example:
  # openstack --insecure resource provider list
  +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
  | uuid                                 | name                                    | generation | root_provider_uuid                   | parent_provider_uuid                 |
  +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
  | 359cccf7-9c64-4edc-a35d-f4673e485a04 | node001.vstoragedomain_pci_0000_01_00_0 | 1          | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
  | b8443d1b-b941-4bf5-ab4b-2dc7c64ac7d1 | node001.vstoragedomain_pci_0000_81_00_0 | 1          | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
  | 4936695a-4711-425a-b0e4-fdab5e4688d6 | node001.vstoragedomain                  | 823        | 4936695a-4711-425a-b0e4-fdab5e4688d6 | None                                 |
  +--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
  In this output, the resource provider with the ID 4936695a-4711-425a-b0e4-fdab5e4688d6 has two child resource providers for the two physical GPUs with the PCI addresses 0000_01_00_0 and 0000_81_00_0.
- Use the obtained ID of a child resource provider to list its inventory. For example:
  # openstack --insecure resource provider inventory list 359cccf7-9c64-4edc-a35d-f4673e485a04
  +----------------+------------------+----------+----------+-----------+----------+-------+
  | resource_class | allocation_ratio | max_unit | reserved | step_size | min_unit | total |
  +----------------+------------------+----------+----------+-----------+----------+-------+
  | VGPU           | 1.0              | 8        | 0        | 1         | 1        | 8     |
  +----------------+------------------+----------+----------+-----------+----------+-------+
  The child resource provider has vGPU resources that can be allocated to virtual machines.
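  To see how many of these vGPU resources are already allocated, you can also display the provider's usage; the UUID below is the child resource provider from the example above:
  # openstack --insecure resource provider usage show 359cccf7-9c64-4edc-a35d-f4673e485a04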
To create a virtual machine with a vGPU
- If you use only one vGPU type in the compute cluster, you need to create a flavor that requests one virtual GPU, and then create virtual machines with this flavor:
  - Create a flavor with the resources property specifying the number of vGPUs to use. For example, to create the vgpu-flavor flavor with 2 vCPUs and 4 GiB of RAM, run:
    # openstack --insecure flavor create --ram 4096 --vcpus 2 --property resources:VGPU=1 --public vgpu-flavor
  - Some drivers may require hiding the hypervisor signature. To do this, add the hide_hypervisor_id property to the flavor:
    # openstack --insecure flavor set vgpu-flavor --property hide_hypervisor_id=true
  - Create a boot volume from an image (for example, Ubuntu):
    # openstack --insecure volume create --size 20 --image ubuntu vgpu-boot-volume
  - Create a virtual machine specifying vgpu-flavor and vgpu-boot-volume. For example, to create the VM vgpu-vm, run:
    # openstack --insecure server create --flavor vgpu-flavor --volume vgpu-boot-volume --network <network_name> vgpu-vm
  The created virtual machine will have a virtual GPU of the type that is configured in the compute cluster.
- If you want to use multiple vGPU types in the compute cluster, you need to manually create CUSTOM_NVIDIA_XXX traits, assign them to the corresponding vGPU resource providers, and then create flavors and virtual machines with the assigned traits:
  - List resource providers in the compute cluster to obtain their IDs. For example:
    # openstack --insecure resource provider list
    +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
    | uuid                                 | name                                    | generation | root_provider_uuid | parent_provider_uuid |
    +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
    | 7d2ef259-42df-4ef8-8eaa-66c3b7448fc3 | node001.vstoragedomain_pci_0000_85_00_0 | 62         | 1f08c319-f270-<…>  | 1f08c319-f270-<…>    |
    | 94a84fc6-2f28-46d5-93e1-e588e347dd3b | node001.vstoragedomain_pci_0000_10_00_0 | 38         | 1f08c319-f270-<…>  | 1f08c319-f270-<…>    |
    | 41c177e3-6998-4e56-8d29-f98f72fef910 | node002.vstoragedomain_pci_0000_85_00_0 | 13         | 9dbc8c64-0048-<…>  | 9dbc8c64-0048-<…>    |
    | 7fd1d10f-9ceb-4cd1-acec-a1254755211b | node002.vstoragedomain_pci_0000_10_00_0 | 13         | 9dbc8c64-0048-<…>  | 9dbc8c64-0048-<…>    |
    +--------------------------------------+-----------------------------------------+------------+--------------------+----------------------+
  - Create custom traits that correspond to different vGPU types. For example, to create the traits CUSTOM_NVIDIA_231 and CUSTOM_NVIDIA_232, run:
    # openstack --insecure trait create CUSTOM_NVIDIA_231
    # openstack --insecure trait create CUSTOM_NVIDIA_232
  - Add the corresponding trait to the resource providers matching the GPU. For example:
    # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_231 7d2ef259-42df-4ef8-8eaa-66c3b7448fc3
    +-------------------+
    | name              |
    +-------------------+
    | CUSTOM_NVIDIA_231 |
    +-------------------+
    # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_231 94a84fc6-2f28-46d5-93e1-e588e347dd3b
    +-------------------+
    | name              |
    +-------------------+
    | CUSTOM_NVIDIA_231 |
    +-------------------+
    Now, the trait CUSTOM_NVIDIA_231 is assigned to the vGPU resource providers of the node node001. To assign the trait CUSTOM_NVIDIA_232 to the vGPU resource providers of the node node002, run:
    # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_232 41c177e3-6998-4e56-8d29-f98f72fef910
    +-------------------+
    | name              |
    +-------------------+
    | CUSTOM_NVIDIA_232 |
    +-------------------+
    # openstack --insecure resource provider trait set --trait CUSTOM_NVIDIA_232 7fd1d10f-9ceb-4cd1-acec-a1254755211b
    +-------------------+
    | name              |
    +-------------------+
    | CUSTOM_NVIDIA_232 |
    +-------------------+
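    To verify which traits are assigned to a resource provider, you can list them; the UUID below is the node001 provider from the example above:
    # openstack --insecure resource provider trait list 7d2ef259-42df-4ef8-8eaa-66c3b7448fc3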
  - Create flavors with the resources property specifying the number of vGPUs to use. For example, to create the vgpu231-flavor flavor with 2 vCPUs and 4 GiB of RAM and the vgpu232-flavor flavor with 4 vCPUs and 8 GiB of RAM, run:
    # openstack --insecure flavor create --ram 4096 --vcpus 2 --property resources:VGPU=1 --public vgpu231-flavor
    # openstack --insecure flavor create --ram 8192 --vcpus 4 --property resources:VGPU=1 --public vgpu232-flavor
  - Add the requested traits to your flavors. For example, to add the traits CUSTOM_NVIDIA_231 and CUSTOM_NVIDIA_232 to the flavors vgpu231-flavor and vgpu232-flavor, run:
    # openstack --insecure flavor set --property trait:CUSTOM_NVIDIA_231=required vgpu231-flavor
    # openstack --insecure flavor set --property trait:CUSTOM_NVIDIA_232=required vgpu232-flavor
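    To confirm that a flavor carries both the vGPU resource request and the trait requirement, you can display its properties:
    # openstack --insecure flavor show vgpu231-flavor -c properties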
  - Create a boot volume from an image (for example, Ubuntu):
    # openstack --insecure volume create --size 20 --image ubuntu vgpu-boot-volume
  - Create virtual machines specifying the prepared flavors and vgpu-boot-volume. For example, to create the VM vgpu231-vm with the vgpu231-flavor flavor and the VM vgpu232-vm with the vgpu232-flavor flavor, run:
    # openstack --insecure server create --volume vgpu-boot-volume --flavor vgpu231-flavor --network <network_name> vgpu231-vm
    # openstack --insecure server create --volume vgpu-boot-volume --flavor vgpu232-flavor --network <network_name> vgpu232-vm
  The created virtual machines will have virtual GPUs of different types that are configured in the compute cluster.
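  To confirm that a VM was scheduled with the expected flavor, you can check its details after creation; vgpu231-vm is the VM from the example above:
  # openstack --insecure server show vgpu231-vm -c name -c status -c flavor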
To check the GPU in a virtual machine
- Log in to the VM via SSH:
  # ssh <username>@<vm_ip_address>
- Install the NVIDIA drivers:
  # sudo apt update && sudo apt install -y nvidia-driver-470 nvidia-utils-470
- Check the GPU by running:
  # nvidia-smi
  The GPU should be recognized and operational.
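  If nvidia-smi does not detect the device, you can first confirm that the vGPU is visible to the guest as a PCI device (this assumes the pciutils package is installed in the VM):
  # lspci | grep -i nvidia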