Monitoring node GPUs

To check the GPU status

Go to the Infrastructure > Nodes screen and click the node name. Open the GPUs tab to see a list of all graphics cards on the node with their statuses.

A GPU can have one of the following statuses:

Configured
The GPU is configured for passthrough or virtualization.
Not configured
The GPU is not configured for passthrough or virtualization. To configure your GPU, refer to Attaching host devices to virtual machines.

To display GPU details

Admin panel

  1. Go to the Infrastructure > Nodes screen and click the node name.
  2. Open the GPUs tab, and then click a GPU to open its details.

GPU details include the GPU ID and alias name, state and status, associated flavors, device name and vendor, and mode (if the GPU is configured).

Command-line interface

Use the following command:

vinfra node gpu list

For example, to view the details of GPUs on all nodes in the cluster, run:

# vinfra node gpu list
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+
| id            | node_id       | host    | status   | vendor             | vendor_id | device                 | device_id | alias      | mode            | pci_address | pci_domain_address |
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+
| 1269b15e<...> | 7e0bcaae<...> | node001 | attached | ASPEED Technology  | 1a03      | ASPEED Graphics Family | 2000      |            |                 | 04:00.0     | 0000:04:00.0       |
| be6c7558<...> | 7e0bcaae<...> | node001 | attached | NVIDIA Corporation | 10de      | TU104GL [Tesla T4]     | 1eb8      | gpu        | hostpassthrough | d8:00.0     | 0000:d8:00.0       |
| ed358a70<...> | 78a80b1a<...> | node003 | attached | ASPEED Technology  | 1a03      | ASPEED Graphics Family | 2000      |            |                 | 04:00.0     | 0000:04:00.0       |
| 194a3307<...> | 78a80b1a<...> | node003 | attached | NVIDIA Corporation | 10de      | TU104GL [Tesla T4]     | 1eb8      | gpu        | hostpassthrough | d8:00.0     | 0000:d8:00.0       |
| 8ea18090<...> | bcedcef0<...> | node002 | attached | ASPEED Technology  | 1a03      | ASPEED Graphics Family | 2000      |            |                 | 04:00.0     | 0000:04:00.0       |
| db4f575a<...> | bcedcef0<...> | node002 | attached | NVIDIA Corporation | 10de      | TU104GL [Tesla T4]     | 1eb8      | nvidia-319 | vgpu            | d8:00.0     | 0000:d8:00.0       |
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+

In the command output, GPU details include the GPU and node IDs, host name, GPU status, vendor name and ID, device name and ID, alias name, mode (if the GPU is configured), and PCI address.
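
If you need to process this output in a script, the table can be parsed as plain text. The following Python sketch is an illustration only and is not part of the product: the function names parse_table and configured_gpus are hypothetical, the sample is an excerpt of the output above, and the rule that an empty mode column means the GPU is not configured is an assumption based on the column description.

```python
# Excerpt of the `vinfra node gpu list` output shown above (three of the six rows).
SAMPLE_OUTPUT = """\
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+
| id            | node_id       | host    | status   | vendor             | vendor_id | device                 | device_id | alias      | mode            | pci_address | pci_domain_address |
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+
| 1269b15e<...> | 7e0bcaae<...> | node001 | attached | ASPEED Technology  | 1a03      | ASPEED Graphics Family | 2000      |            |                 | 04:00.0     | 0000:04:00.0       |
| be6c7558<...> | 7e0bcaae<...> | node001 | attached | NVIDIA Corporation | 10de      | TU104GL [Tesla T4]     | 1eb8      | gpu        | hostpassthrough | d8:00.0     | 0000:d8:00.0       |
| db4f575a<...> | bcedcef0<...> | node002 | attached | NVIDIA Corporation | 10de      | TU104GL [Tesla T4]     | 1eb8      | nvidia-319 | vgpu            | d8:00.0     | 0000:d8:00.0       |
+---------------+---------------+---------+----------+--------------------+-----------+------------------------+-----------+------------+-----------------+-------------+--------------------+
"""


def parse_table(text):
    """Parse an ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Border lines start with "+"; only data and header lines start with "|".
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def configured_gpus(rows):
    """Return the rows whose "mode" column is not empty.

    Assumption: an empty mode means the GPU is not configured.
    """
    return [row for row in rows if row.get("mode")]


if __name__ == "__main__":
    for gpu in configured_gpus(parse_table(SAMPLE_OUTPUT)):
        print(gpu["host"], gpu["device"], gpu["alias"], gpu["mode"])
```

In a real script, the text passed to parse_table would come from the captured output of the command instead of the embedded sample.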