Switching between GPU passthrough and vGPU

If you have already enabled GPU passthrough for the compute cluster but want to use vGPU instead, or vice versa, you need to reconfigure both the node that hosts the physical GPU and the compute cluster.

Prerequisites

The compute cluster is configured for GPU passthrough or vGPU support, as described in Enabling PCI passthrough and vGPU support.
The GPU that you want to reconfigure is not used by any virtual machine.

To reconfigure the compute cluster from GPU passthrough to vGPU support

On the node with the physical GPU, find the service that is associated with the GPU. For example:

```
# systemctl | grep stub
  pcistub-0000:01:00.0.service       loaded active exited    Bind device to pci-stub driver
```

Disable this service. For example:

```
# systemctl disable pcistub-0000:01:00.0.service
```
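If several devices on the node are bound to pci-stub, finding and disabling their services can be scripted. The following is a minimal sketch, assuming that the unit names follow the pcistub-<PCI address>.service pattern seen in the example output. The function name list_pcistub_units is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl` output on stdin and print the
# names of units that match the pcistub-<PCI address>.service pattern.
list_pcistub_units() {
    grep -oE 'pcistub-[0-9a-fA-F:.]+\.service'
}

# Example use (run as root on the GPU node), commented out on purpose:
# systemctl | list_pcistub_units | while read -r unit; do
#     systemctl disable "$unit"
# done
```

Review the list that the helper prints before disabling anything, because every pci-stub binding on the node matches the pattern, not only the one for the GPU.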

Reboot the node to apply your changes:
```
# reboot
```
Install the NVIDIA vGPU driver:
1. Install the kernel-devel and dkms packages:
```
# dnf install kernel-devel dkms
```
2. Enable and start the dkms service:
```
# systemctl enable dkms.service
# systemctl start dkms.service
```
3. Install the vGPU KVM kernel module from the NVIDIA GRID package with the --dkms option:
```
# bash NVIDIA-Linux-x86_64-xxx.xx.xx-vgpu-kvm*.run --dkms
```
4. Re-create the Linux boot image (initramfs) by running:
```
# dracut -f
```
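Before continuing, it can help to confirm that the vGPU kernel module is actually loaded. The sketch below checks a /proc/modules-style listing for a module name. The helper name module_loaded is hypothetical, and the module name nvidia_vgpu_vfio in the usage comment is an assumption based on NVIDIA's vGPU for KVM packaging; it may differ between driver releases.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named kernel module appears in a
# /proc/modules-style file (the first column is the module name).
# $1 = module name, $2 = file to read (defaults to /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example use; the module name is an assumption, see the text above:
# module_loaded nvidia_vgpu_vfio && echo "vGPU module is loaded"
```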
[For modern GPUs with SR-IOV support] Enable the virtual functions for your GPU:
```
# /usr/libexec/vstorage-ui-agent/bin/pci-helper.py nvidia-sriov-mgr --enable
```
Modify the configuration file config.yaml:
- Change device_type from generic to pgpu.
- Set device to the GPU's PCI address.
- Remove the alias field.
- Add the desired vgpu_type.
As a result, your config.yaml may look as follows:
```
- node_id: c3b2321a-7c12-8456-42ce-8005ff937e12
  devices:
    - device_type: pgpu
      device: "0000:01:00.0"
      vgpu_type: nvidia-224
```
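To choose a value for vgpu_type, you can list the types that the device advertises. The following sketch assumes that the installed driver exposes the kernel's mdev sysfs interface (mdev_supported_types); this is not guaranteed for every driver release, and on GPUs with SR-IOV the types may be listed under the addresses of the virtual functions rather than the physical device. The helper name list_vgpu_types is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the vGPU type IDs and their names that a
# device advertises through the mdev sysfs interface.
# $1 = PCI address, $2 = mdev bus directory (defaults to /sys/class/mdev_bus).
list_vgpu_types() {
    types_dir="${2:-/sys/class/mdev_bus}/$1/mdev_supported_types"
    for t in "$types_dir"/*; do
        [ -d "$t" ] || continue   # skip if the glob matched nothing
        printf '%s\t%s\n' "$(basename "$t")" "$(cat "$t/name" 2>/dev/null)"
    done
}

# Example use:
# list_vgpu_types 0000:01:00.0
```

Each output line holds a type ID, such as nvidia-224, followed by the name that the driver reports for it.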
Pass the configuration file to the vinfra service compute set command. For example:
```
# vinfra service compute set --pci-passthrough-config config.yaml
```

To reconfigure the compute cluster from vGPU support to GPU passthrough

Remove the vGPU-related information from the configuration file config.yaml. For example, you may need to remove these lines:
```
- device_type: pgpu
  device: "0000:01:00.0"
  vgpu_type: nvidia-224
```
Reconfigure the compute cluster by using the updated configuration file config.yaml. For example:
```
# vinfra service compute set --pci-passthrough-config config.yaml
```
[For modern GPUs with SR-IOV support] Disable the virtual functions for your GPU:
```
# /usr/libexec/vstorage-ui-agent/bin/pci-helper.py nvidia-sriov-mgr --disable
```

Uninstall the NVIDIA vGPU driver:
```
# bash NVIDIA-Linux-x86_64-xxx.xx.xx-vgpu-kvm*.run --uninstall
```

On the node with the physical GPU, run the pci-helper.py script to bind the GPU, identified by its PCI address, to the pci-stub driver. For example:
```
# /usr/libexec/vstorage-ui-agent/bin/pci-helper.py bind-to-stub 0000:01:00.0
```
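To check the result of the binding, you can read the driver symlink that the kernel maintains for each PCI device in sysfs. This is a minimal sketch; the helper name pci_driver_of is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the kernel driver bound to a
# PCI device, or print nothing if no driver is bound.
# $1 = PCI address, $2 = PCI devices directory (defaults to /sys/bus/pci/devices).
pci_driver_of() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    [ -L "$link" ] && basename "$(readlink "$link")"
}

# Example use:
# pci_driver_of 0000:01:00.0
```

After a successful bind, the helper is expected to print pci-stub for the GPU's address.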
Add the GPU card to the configuration file as a generic device. For example:
```
- device_type: generic
  device: 1b36:0100
  alias: gpu
```
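The device value in this example has the form of a vendor:device ID pair rather than a PCI address. Assuming that is what the field expects, the pair for a given card can be read from sysfs, as sketched below; the helper name pci_vendor_device_id is hypothetical. Running lspci -nn -s with the PCI address shows the same pair in square brackets.

```shell
#!/bin/sh
# Hypothetical helper: print a device's vendor:device ID pair from the
# sysfs "vendor" and "device" files, which hold values such as 0x10de.
# $1 = PCI address, $2 = PCI devices directory (defaults to /sys/bus/pci/devices).
pci_vendor_device_id() {
    dev_dir="${2:-/sys/bus/pci/devices}/$1"
    vendor=$(cat "$dev_dir/vendor") || return 1
    device=$(cat "$dev_dir/device") || return 1
    # Strip the leading 0x so the output matches the xxxx:xxxx form.
    printf '%s:%s\n' "${vendor#0x}" "${device#0x}"
}

# Example use:
# pci_vendor_device_id 0000:01:00.0
```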
Pass the configuration file to the vinfra service compute set command. For example:
```
# vinfra service compute set --pci-passthrough-config config.yaml
```