Monitoring the compute cluster

After you create the compute cluster, you can monitor its status and statistics. You can also monitor individual compute nodes, virtual machines, and load balancers.

To view the compute cluster status

Click the cluster name at the bottom of the left menu. The cluster status can be one of the following:

Healthy
All compute cluster components and nodes operate normally.
Configuring
The compute cluster configuration (the default CPU model for VMs or the number of compute nodes) is changing.
Warning
The compute cluster operates normally but some issues have been detected.
Critical
The compute cluster has encountered a critical problem and is not operational.
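
If you feed the cluster status into an external monitoring check, the four statuses above can be mapped to severity levels. The following is a minimal sketch only: the exit codes follow the common Nagios plugin convention, the names STATUS_EXIT_CODES and exit_code_for are hypothetical, and treating Configuring as non-faulty is an assumption. How you obtain the status string is outside the scope of this sketch.

```python
# Hypothetical mapping from the four statuses described above to
# Nagios-style exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
STATUS_EXIT_CODES = {
    "Healthy": 0,
    "Configuring": 0,  # assumption: a configuration change is not a fault
    "Warning": 1,
    "Critical": 2,
}


def exit_code_for(status: str) -> int:
    """Return an exit code for a status string, ignoring case and padding."""
    # Any status not listed above is reported as UNKNOWN (3).
    return STATUS_EXIT_CODES.get(status.strip().capitalize(), 3)


if __name__ == "__main__":
    for name in ("Healthy", "Configuring", "Warning", "Critical"):
        print(name, exit_code_for(name))
```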

To view the compute cluster statistics

Admin panel

Go to the Compute > Overview screen, which shows charts with the compute cluster statistics.

Command-line interface

Use the following command:

# vinfra service compute stat
+----------+----------------------------------------------+
| Field    | Value                                        |
+----------+----------------------------------------------+
| compute  | block_capacity: 172874997760                 |
|          | block_usage: 22953803776                     |
|          | cpu_allocation_ratio: 8                      |
|          | cpu_usage: 13.3                              |
|          | ram_allocation_ratio: 1.0                    |
|          | vcpus: 7                                     |
|          | vcpus_free: 33                               |
|          | vm_mem_capacity: 35526971392                 |
|          | vm_mem_free: 22105198592                     |
|          | vm_mem_reserved: 13421772800                 |
|          | vm_mem_usage: 10906963968                    |
| datetime | 2022-10-05T13:21:12.447758                   |
| fenced   | physical_cpu_cores: 0                        |
|          | physical_cpu_usage: 0                        |
|          | physical_mem_total: 0                        |
|          | reserved_memory: 0                           |
|          | vcpus: 0                                     |
|          | vm_mem_capacity: 0                           |
| physical | block_capacity: 810773667840                 |
|          | block_free: 713247113216                     |
|          | cpu_cores: 12                                |
|          | cpu_usage: 25.8                              |
|          | mem_total: 75331031040                       |
|          | vcpus_total: 96                              |
| reserved | cpus: 7                                      |
|          | memory: 39804059648                          |
|          | vcpus: 56                                    |
| servers  | count: 5                                     |
|          | error: 0                                     |
|          | in_progress: 0                               |
|          | running: 4                                   |
|          | stopped: 1                                   |
|          | top:                                         |
|          |   disk:                                      |
|          |   - id: f3f522ac-05f7-4849-827d-d787a77edd56 |
|          |     name: k8s2-kvgcdwapwxbh-node-0           |
|          |     size: 6345576448                         |
|          |   - id: f53a6885-7740-4bf7-9765-4c2200b38c2f |
|          |     name: k8s3-dgl3edvjcbf3-node-0           |
|          |     size: 6133764096                         |
|          |   - id: 9a182bc6-54b7-47d9-9553-fcdf712d5a22 |
|          |     name: k8s2-kvgcdwapwxbh-master-0         |
|          |     size: 5385080832                         |
|          |   - id: 38de311b-f46b-4abe-a55b-2c04f677182e |
|          |     name: k8s3-dgl3edvjcbf3-master-0         |
|          |     size: 4925804544                         |
|          |   - id: b5d6bb82-6137-4748-b94a-c1056d4cd8c9 |
|          |     name: vm1                                |
|          |     size: 163577856                          |
|          |   memory:                                    |
|          |   - id: 38de311b-f46b-4abe-a55b-2c04f677182e |
|          |     name: k8s3-dgl3edvjcbf3-master-0         |
|          |     size: 3264253952                         |
|          |   - id: 9a182bc6-54b7-47d9-9553-fcdf712d5a22 |
|          |     name: k8s2-kvgcdwapwxbh-master-0         |
|          |     size: 3132764160                         |
|          |   - id: f53a6885-7740-4bf7-9765-4c2200b38c2f |
|          |     name: k8s3-dgl3edvjcbf3-node-0           |
|          |     size: 2259763200                         |
|          |   - id: f3f522ac-05f7-4849-827d-d787a77edd56 |
|          |     name: k8s2-kvgcdwapwxbh-node-0           |
|          |     size: 2250182656                         |
|          |   - id: b5d6bb82-6137-4748-b94a-c1056d4cd8c9 |
|          |     name: vm1                                |
|          |     size: 0                                  |
|          |   vcpus:                                     |
|          |   - count: 0.55                              |
|          |     id: 38de311b-f46b-4abe-a55b-2c04f677182e |
|          |     name: k8s3-dgl3edvjcbf3-master-0         |
|          |   - count: 0.51                              |
|          |     id: 9a182bc6-54b7-47d9-9553-fcdf712d5a22 |
|          |     name: k8s2-kvgcdwapwxbh-master-0         |
|          |   - count: 0.29                              |
|          |     id: f53a6885-7740-4bf7-9765-4c2200b38c2f |
|          |     name: k8s3-dgl3edvjcbf3-node-0           |
|          |   - count: 0.24                              |
|          |     id: f3f522ac-05f7-4849-827d-d787a77edd56 |
|          |     name: k8s2-kvgcdwapwxbh-node-0           |
|          |   - count: 0                                 |
|          |     id: b5d6bb82-6137-4748-b94a-c1056d4cd8c9 |
|          |     name: vm1                                |
+----------+----------------------------------------------+
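
To turn this output into numbers that you can track over time, you can parse the table and derive usage ratios. The sketch below is illustrative rather than an official tool: parse_stat_table and usage_ratios are hypothetical helpers, the sample is an abbreviated copy of the output above, and the parser reads only the top-level "key: value" rows while skipping the nested "top" lists. It also assumes that vcpus plus vcpus_free gives the total number of vCPUs available to the compute cluster. If your vinfra version offers a machine-readable output format, that would likely be a more robust input than the table.

```python
import re

# Abbreviated copy of the table printed by "vinfra service compute stat".
SAMPLE = """\
+----------+----------------------------------------------+
| Field    | Value                                        |
+----------+----------------------------------------------+
| compute  | block_capacity: 172874997760                 |
|          | block_usage: 22953803776                     |
|          | vcpus: 7                                     |
|          | vcpus_free: 33                               |
|          | vm_mem_capacity: 35526971392                 |
|          | vm_mem_usage: 10906963968                    |
| datetime | 2022-10-05T13:21:12.447758                   |
| servers  | count: 5                                     |
|          | running: 4                                   |
|          | top:                                         |
|          |   disk:                                      |
|          |   - id: f3f522ac-05f7-4849-827d-d787a77edd56 |
+----------+----------------------------------------------+
"""

# A top-level entry: no indentation, an identifier, then ": value".
KEY_VALUE = re.compile(r"^([A-Za-z_]\w*): (\S.*)$")


def parse_scalar(text):
    """Convert text to int or float when possible, else return it as is."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_stat_table(table_text):
    """Parse top-level rows of the table into a dict keyed by field name."""
    stats = {}
    field = None
    for line in table_text.splitlines():
        cells = line.strip().split("|")
        if len(cells) != 4:
            continue  # border rows and anything that is not a table row
        name = cells[1].strip()
        value = cells[2].rstrip()
        if value.startswith(" "):
            value = value[1:]  # drop the padding space, keep indentation
        if (name, value.strip()) == ("Field", "Value"):
            continue  # header row
        if name:
            field = name
        if field is None:
            continue
        match = KEY_VALUE.match(value)
        if match:
            stats.setdefault(field, {})[match.group(1)] = parse_scalar(match.group(2))
        elif name and value.strip() and not value.strip().endswith(":"):
            stats[field] = parse_scalar(value.strip())  # e.g. datetime
    return stats


def usage_ratios(stats):
    """Return usage as fractions between 0 and 1 (see assumptions above)."""
    compute = stats["compute"]
    return {
        "vcpus": compute["vcpus"] / (compute["vcpus"] + compute["vcpus_free"]),
        "memory": compute["vm_mem_usage"] / compute["vm_mem_capacity"],
        "block_storage": compute["block_usage"] / compute["block_capacity"],
    }


if __name__ == "__main__":
    for resource, ratio in usage_ratios(parse_stat_table(SAMPLE)).items():
        print(f"{resource}: {ratio:.1%}")
```

In practice, you would pass the captured command output to parse_stat_table instead of SAMPLE.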

To view more details about the compute cluster

Go to the Monitoring > Dashboard screen, and then click Grafana dashboard. A separate browser tab will open with preconfigured Grafana dashboards.

The Compute service status dashboard shows the status of the compute services and agents on all of the compute nodes. You can sort the displayed services by hostname, service name, and service status.

For detailed monitoring of compute resource allocation, use the Compute resource allocation dashboard. The charts on this dashboard show the usage of vCPUs, memory, storage space per storage policy, and floating IP addresses. You can view usage statistics for all domains and projects, or filter the data by a specific domain or project.

To monitor the compute API requests, use the Compute service API details dashboard. The charts on this dashboard show the rate of successful and failed requests, as well as the 95th and 99th percentiles of response time, over 10-minute intervals. You can filter the displayed requests by compute service. The most important charts here are those of the failed request rate and the response time. If you see spikes in them, check the status of the corresponding services.
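
To make these metrics concrete, the following sketch computes the failed request ratio and the 95th and 99th percentiles of response time for each 10-minute window. It only illustrates what the numbers mean and is not how the dashboard computes them: the input format (timestamp, response time, success flag), the nearest-rank percentile method, and the names percentile and summarize are all assumptions made for this example.

```python
import math
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute intervals


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the values less than or equal to it. Expects a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests):
    """Group (unix_time, response_seconds, succeeded) tuples into
    10-minute windows and summarize each window."""
    windows = defaultdict(list)
    for timestamp, duration, succeeded in requests:
        start = int(timestamp) // WINDOW_SECONDS * WINDOW_SECONDS
        windows[start].append((duration, succeeded))

    summary = {}
    for start, items in sorted(windows.items()):
        durations = [duration for duration, _ in items]
        failed = sum(1 for _, succeeded in items if not succeeded)
        summary[start] = {
            "requests": len(items),
            "failed_ratio": failed / len(items),
            "p95": percentile(durations, 95),
            "p99": percentile(durations, 99),
        }
    return summary


if __name__ == "__main__":
    # Made-up data: 100 requests in the first window, every tenth one fails.
    sample = [(i, (i + 1) / 100, i % 10 != 0) for i in range(100)]
    for start, metrics in summarize(sample).items():
        print(start, metrics)
```

A window in which p99 rises sharply while p95 stays flat points to a small number of very slow requests, whereas a rise in both suggests a general slowdown of the service.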

The RabbitMQ nodes, RabbitMQ messages, and RabbitMQ clients dashboards are intended for the support team to troubleshoot the RabbitMQ cluster. The PostgreSQL overview dashboard shows the PostgreSQL database size and replication status, as well as other database details. To see a detailed description of each chart, click the i icon in its left corner.