Configuring SR-IOV support

Limitations

  • Virtual machines with attached PCI devices cannot be live migrated.

Prerequisites

Procedure overview

  1. Prepare a compute node for SR-IOV support.
  2. Reconfigure the compute cluster to enable SR-IOV support.
  3. Create a virtual machine with an SR-IOV network port.

To prepare a node for SR-IOV

  1. List all network adapters on the node and obtain their vendor ID (VID) and product ID (PID):

    # lspci -nnD | grep Ethernet
    0000:00:03.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
    0000:00:04.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]

    In this output, [15b3:1017] is the VID and PID pair of the network adapter: 15b3 is the VID and 1017 is the PID.

  2. Check that the chosen network adapter supports SR-IOV by using its VID and PID:

    # lspci -vv -d 15b3:1017 | grep SR-IOV
    Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
    Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
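
    The capability line is printed once for each matching device. To also see how many virtual functions the hardware supports, you can print the full SR-IOV capability block of one device by its PCI address (here, 0000:00:03.0 from step 1; adjust the address and the number of context lines as needed):

    # lspci -vv -s 0000:00:03.0 | grep -A 9 'SR-IOV'
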
  3. Enable IOMMU on the node by running the pci-helper.py enable-iommu script, and then reboot the node to apply the changes:

    # /usr/libexec/vstorage-ui-agent/bin/pci-helper.py enable-iommu
    # reboot

    The script works for both Intel and AMD processors.

  4. Verify that IOMMU is enabled in the dmesg output:

    # dmesg | grep -e DMAR -e IOMMU
    [    0.000000] DMAR: IOMMU enabled
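
    On AMD processors, look for AMD-Vi messages instead of DMAR. A vendor-neutral check is to count the IOMMU groups exposed in sysfs; a non-zero count means the IOMMU is active:

    # ls /sys/kernel/iommu_groups/ | wc -l
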
  5. [For NVIDIA Mellanox network adapters] Enable SR-IOV in firmware:

    1. Download Mellanox Firmware Tools (MFT) from the official website and extract the archive on the node. For example:

      # wget https://www.mellanox.com/downloads/MFT/mft-4.17.0-106-x86_64-rpm.tgz
      # tar -xvzf mft-4.17.0-106-x86_64-rpm.tgz
    2. Install the rpm-build package, which the installation script requires, then run the script and start Mellanox Software Tools (MST):

      # yum install rpm-build
      # ./mft-4.17.0-106-x86_64-rpm/install.sh
      # mst start

    3. Determine the MST device path:

      # mst status
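
      The device paths reported by mst status also appear as files under the /dev/mst directory, so you can double-check with:

      # ls /dev/mst
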
    4. Query the current configuration:

      # mlxconfig -d /dev/mst/mt4119_pciconf0 q
      ...
      Configurations:
      ...
               NUM_OF_VFS            4               # Number of activated VFs
               SRIOV_EN              True(1)         # SR-IOV is enabled
      ...
    5. Set the desired values, if necessary. For example, to increase the number of virtual functions to 8, run:

      # mlxconfig -d /dev/mst/mt4119_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=8
    6. Reboot the node to apply the changes.
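
      After the node is back up, you can confirm that the new values took effect by restarting MST and repeating the query from substep 4:

      # mst start
      # mlxconfig -d /dev/mst/mt4119_pciconf0 q | grep -e SRIOV_EN -e NUM_OF_VFS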

To enable SR-IOV support for the compute cluster

  1. Create a configuration file in YAML format. For example:

    # cat << EOF > pci-passthrough.yaml
    - node_id: c3b2321a-7c12-8456-42ce-8005ff937e12
      devices:
        - device_type: sriov
          device: enp2s0
          physical_network: sriovnet
          num_vfs: 8
    EOF

    In this example:

    • node_id is the UUID of the compute node that hosts the network adapter
    • sriov is the device type for a network adapter
    • enp2s0 is the device name of the network adapter
    • sriovnet is an arbitrary name that will be used as an alias for the network adapter
    • num_vfs is the number of virtual functions to create for the network adapter
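
    Assuming the top-level YAML list accepts multiple entries, a single file can describe several nodes, or several devices per node. A sketch with hypothetical node UUIDs and a second hypothetical device name:

    - node_id: <first node UUID>
      devices:
        - device_type: sriov
          device: enp2s0
          physical_network: sriovnet
          num_vfs: 8
    - node_id: <second node UUID>
      devices:
        - device_type: sriov
          device: enp3s0
          physical_network: sriovnet
          num_vfs: 4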

    The maximum number of virtual functions supported by a PCI device is specified in the /sys/class/net/<device_name>/device/sriov_totalvfs file. For example:

    # cat /sys/class/net/enp2s0/device/sriov_totalvfs
    63
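
    If you are not sure which adapters are SR-IOV capable, a quick sysfs sweep lists the maximum for every network device that exposes this attribute (devices without SR-IOV support have no such file):

    # for f in /sys/class/net/*/device/sriov_totalvfs; do echo "$f: $(cat "$f")"; done
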
  2. Reconfigure the compute cluster by using this configuration file:

    # vinfra service compute set --pci-passthrough-config pci-passthrough.yaml
    +---------+--------------------------------------+
    | Field   | Value                                |
    +---------+--------------------------------------+
    | task_id | 89c8a6c4-f480-424e-ab44-c2f4e2976eb9 |
    +---------+--------------------------------------+
  3. Check the status of the task:

    # vinfra task show 89c8a6c4-f480-424e-ab44-c2f4e2976eb9
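
    The reconfiguration can take a while. To poll the task until it finishes, you can, for example, re-run the same command with watch:

    # watch -n 5 vinfra task show 89c8a6c4-f480-424e-ab44-c2f4e2976eb9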

To create a virtual machine with an SR-IOV network port

  1. Create a physical compute network, specifying the network adapter alias from the pci-passthrough.yaml file and the default vNIC type direct. Additionally, disable the built-in DHCP server and specify the desired IP address range. For example, to create the sriov-network network with the 10.10.10.0/24 CIDR, run:

    # vinfra service compute network create sriov-network --physical-network sriovnet --default-vnic-type direct \
    --no-dhcp --cidr 10.10.10.0/24
  2. Create a virtual machine attached to the new network. For example, to create the VM sriov-vm from the centos7 image and with the large flavor, run:

    # vinfra service compute server create sriov-vm --network id=sriov-network --volume source=image,size=11,id=centos7 --flavor large
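
    To verify on the compute node that the virtual functions were created and one of them is assigned to the VM, you can list the VF table of the physical function with the standard iproute2 tooling (the device name is the one from pci-passthrough.yaml):

    # ip link show enp2s0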