Enabling PCI passthrough and vGPU support

To enable PCI passthrough and vGPU support for the compute cluster, create a configuration file in YAML format, and then use it to reconfigure the compute cluster.

Prerequisites

To create the PCI passthrough and vGPU configuration file

Specify the identifier of a compute node that hosts PCI devices, and then add host devices that you want to pass through or virtualize:

  • To create virtual functions for a network adapter, add these lines:

    - device_type: sriov
      device: enp2s0
      physical_network: sriovnet
      num_vfs: 8

    where:

    • sriov is the device type for a network adapter
    • enp2s0 is the device name of a network adapter
    • sriovnet is an arbitrary name that will be used as an alias for a network adapter
    • 8 (the num_vfs value) is the number of virtual functions to create for a network adapter

    The maximum number of virtual functions supported by a PCI device is specified in the /sys/class/net/<device_name>/device/sriov_totalvfs file. For example:

    # cat /sys/class/net/enp2s0/device/sriov_totalvfs
    63

  • To enable GPU passthrough, add these lines:

    - device_type: generic
      device: 1b36:0100
      alias: gpu

    where:

    • generic is the device type for a physical GPU that will be passed through
    • 1b36:0100 is the vendor ID and product ID (VID:PID) of a physical GPU
    • gpu is an arbitrary name that will be used as an alias for a physical GPU

  • To enable a vGPU, with or without SR-IOV support, add these lines:

    - device_type: pgpu
      device: "0000:c1:00.0"
      vgpu_type: nvidia-558

    where:

    • pgpu is the device type for a physical GPU that will be virtualized
    • "0000:c1:00.0" is the PCI address of a physical GPU
    • nvidia-558 is the vGPU type that will be enabled for a physical GPU
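
Before writing a num_vfs value into the configuration file, it may help to compare it with the limit reported in sriov_totalvfs. The following is a minimal sketch, not part of the product: the check_num_vfs function and the SYSFS_ROOT variable are hypothetical names introduced here, and SYSFS_ROOT exists only so that the function can be tried against a test directory instead of /sys.

```shell
# Hypothetical helper: compare a requested num_vfs value with the device limit.
# SYSFS_ROOT is an assumption of this sketch; it defaults to the real /sys.
check_num_vfs() {
    dev="$1"
    requested="$2"
    root="${SYSFS_ROOT:-/sys}"
    limit_file="$root/class/net/$dev/device/sriov_totalvfs"

    # The file is missing if the adapter has no SR-IOV support
    # or if the device name is wrong.
    if [ ! -r "$limit_file" ]; then
        echo "$dev: cannot read sriov_totalvfs"
        return 2
    fi

    limit=$(cat "$limit_file")
    if [ "$requested" -le "$limit" ]; then
        echo "$dev: num_vfs=$requested is within the limit of $limit"
        return 0
    fi
    echo "$dev: num_vfs=$requested exceeds the limit of $limit"
    return 1
}
```

For example, running check_num_vfs enp2s0 8 on the node that hosts the adapter reports whether eight virtual functions fit within the adapter's limit.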

The entire configuration file may look as follows:

# cat config.yaml
- node_id: c3b2321a-7c12-8456-42ce-8005ff937e12
  devices:
    - device_type: sriov
      device: enp2s0
      physical_network: sriovnet
      num_vfs: 8
    - device_type: generic
      device: 1b36:0100
      alias: gpu
    - device_type: pgpu
      device: "0000:01:00.0"
      vgpu_type: nvidia-232
- node_id: 1d6481c2-1fd5-406b-a0c7-330f24bd0e3d
  devices:
    - device_type: generic
      device: 10de:1eb8
      alias: gpu
    - device_type: pgpu
      device: "0000:03:00.0"
      vgpu_type: nvidia-224
    - device_type: pgpu
      device: "0000:c1:00.0"
      vgpu_type: nvidia-558
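
A typo in a device_type value is easy to miss in a longer file. The following sketch scans a configuration file and reports any device_type other than the three described in this section (sriov, generic, and pgpu). The check_device_types function is a hypothetical helper, not a product tool, and it assumes the file is laid out as in the example above, with each device_type on its own "- device_type: <value>" line. It does not validate YAML syntax.

```shell
# Hypothetical helper: flag device_type values not described in this section.
# Assumes one "- device_type: <value>" entry per line, as in the example file.
check_device_types() {
    awk '
        BEGIN { bad = 0 }
        $1 == "-" && $2 == "device_type:" {
            if ($3 != "sriov" && $3 != "generic" && $3 != "pgpu") {
                printf "line %d: unknown device_type %s\n", NR, $3
                bad = 1
            }
        }
        END { exit bad }
    ' "$1"
}
```

Running check_device_types config.yaml prints nothing and returns 0 when every entry uses one of the three types, and returns 1 otherwise.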

To configure the compute cluster for PCI passthrough and vGPU support

Pass the configuration file to the vinfra service compute set command. For example:

# vinfra service compute set --pci-passthrough-config config.yaml

If the compute configuration fails

Check whether the following error appears in /var/log/vstorage-ui-backend/ansible.log:

2021-09-23 16:42:59,796 p=32130 u=vstoradmin | fatal: [32c8461b-92ec-48c3-ae02-4d12194acd02]: FAILED! => {"changed": true, "cmd": "echo 4 > /sys/class/net/enp103s0f1/device/sriov_numvfs", "delta": "0:00:00.127417", "end": "2021-09-23 19:42:59.784281", "msg": "non-zero return code", "rc": 1, "start": "2021-09-23 19:42:59.656864", "stderr": "/bin/sh: line 0: echo: write error: Cannot allocate memory", "stderr_lines": ["/bin/sh: line 0: echo: write error: Cannot allocate memory"], "stdout": "", "stdout_lines": []}

In this case, run the pci-helper.py script, and then reboot the node:

# /usr/libexec/vstorage-ui-agent/bin/pci-helper.py enable-iommu --pci-realloc
# reboot

When the node is up again, repeat the vinfra service compute set command.
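
Searching the log by hand can be replaced with a short check. The sketch below is an illustration only: sriov_alloc_failed is a hypothetical function name, and the pattern assumes that each Ansible record occupies a single line in the log, so that the sriov_numvfs command and the "Cannot allocate memory" message appear on the same line.

```shell
# Hypothetical helper: succeed if the log contains the SR-IOV allocation error.
# Assumes that each Ansible record is written to the log as a single line.
sriov_alloc_failed() {
    log="${1:-/var/log/vstorage-ui-backend/ansible.log}"
    grep -q 'sriov_numvfs.*Cannot allocate memory' "$log"
}
```

The function returns 0 if the error is present, so it can guard the recovery steps, for example: sriov_alloc_failed && echo "run pci-helper.py and reboot".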

To check that a node has vGPU resources for allocation

List resource providers in the compute cluster to obtain their IDs. For example:

# openstack --insecure resource provider list
+--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
| uuid                                 | name                                    | generation | root_provider_uuid                   | parent_provider_uuid                 |
+--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+
| 359cccf7-9c64-4edc-a35d-f4673e485a04 | node001.vstoragedomain_pci_0000_01_00_0 |          1 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
| b8443d1b-b941-4bf5-ab4b-2dc7c64ac7d1 | node001.vstoragedomain_pci_0000_81_00_0 |          1 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | 4936695a-4711-425a-b0e4-fdab5e4688d6 |
| 4936695a-4711-425a-b0e4-fdab5e4688d6 | node001.vstoragedomain                  |        823 | 4936695a-4711-425a-b0e4-fdab5e4688d6 | None                                 |
+--------------------------------------+-----------------------------------------+------------+--------------------------------------+--------------------------------------+

In this output, the resource provider with the ID 4936695a-4711-425a-b0e4-fdab5e4688d6 has two child resource providers for two physical GPUs with PCI addresses 0000_01_00_0 and 0000_81_00_0.
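
To match a child resource provider with a device entry in the configuration file, the PCI address can be recovered from the provider name. The sketch below assumes the naming pattern seen in the output above, <hostname>_pci_<domain>_<bus>_<slot>_<function>; provider_pci_address is a hypothetical helper name.

```shell
# Hypothetical helper: convert a child provider name to a PCI address.
# Assumes names end with _pci_<domain>_<bus>_<slot>_<function>, as shown above.
# Prints nothing for names that do not match, such as the root provider.
provider_pci_address() {
    printf '%s\n' "$1" |
        sed -n 's/.*_pci_\([0-9a-fA-F]\{4\}\)_\([0-9a-fA-F]\{2\}\)_\([0-9a-fA-F]\{2\}\)_\([0-7]\)$/\1:\2:\3.\4/p'
}
```

For example, provider_pci_address node001.vstoragedomain_pci_0000_01_00_0 prints 0000:01:00.0, which is the form used for the device key of a pgpu entry.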

Use the obtained ID of a child resource provider to list its inventory. For example:

# openstack --insecure resource provider inventory list 359cccf7-9c64-4edc-a35d-f4673e485a04
+----------------+------------------+----------+----------+-----------+----------+-------+
| resource_class | allocation_ratio | max_unit | reserved | step_size | min_unit | total |
+----------------+------------------+----------+----------+-----------+----------+-------+
| VGPU           |              1.0 |        8 |        0 |         1 |        1 |     8 |
+----------------+------------------+----------+----------+-----------+----------+-------+

The child resource provider has vGPU resources that can be allocated to virtual machines.
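
The same check can be scripted. The sketch below sums the VGPU total from the inventory listing when it is printed with the -f value output format and the -c column selectors of the openstack client, which produces one "<resource_class> <total>" pair per line. The vgpu_total function is a hypothetical helper that reads such lines from standard input.

```shell
# Hypothetical helper: sum the "total" column for the VGPU resource class.
# Expects "<resource_class> <total>" lines on standard input, for example from:
#   openstack --insecure resource provider inventory list <provider_id> \
#       -f value -c resource_class -c total
vgpu_total() {
    awk '$1 == "VGPU" { sum += $2 } END { print sum + 0 }'
}
```

A result greater than zero indicates that the provider has vGPU resources in its inventory; zero means that no VGPU inventory was reported.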