Compute alerts
The compute alerts listed below are generated from the metrics described in Compute metrics and are displayed in the admin panel.
Compute service alerts
OpenStack Cinder API is down
Description: OpenStack Block Storage (Cinder) API service is down.
Remediation:
- Ensure that the cinder_api container is up on the management node by running:
  # docker ps --all | grep cinder_api
- If the container is down, start it by running:
  # docker start cinder_api
- Check the service log at /var/log/hci/cinder/cinder-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

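The check-and-restart pattern above is the same for every container-based service alert in this section. As a minimal sketch, assuming root access on the node that raised the alert (the container name and log path come from the alert entry):

# docker ps --all | grep cinder_api
# docker start cinder_api
# docker ps | grep cinder_api
# tail -n 100 /var/log/hci/cinder/cinder-api.log

Running docker ps without --all after the restart confirms that the container stayed up; if it exits again, the tail command shows the most recent log entries to include in a support request.
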
OpenStack Cinder Scheduler is down
Description: OpenStack Block Storage (Cinder) Scheduler agent is down on host <hostname>.
Remediation:
- Ensure that the cinder_scheduler container is up on the specified node by running:
  # docker ps --all | grep cinder_scheduler
- If the container is down, start it by running:
  # docker start cinder_scheduler
- Check the service log at /var/log/hci/cinder/cinder-scheduler.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Cinder Volume agent is down
Description: OpenStack Block Storage (Cinder) Volume agent is down on host <hostname>.
Remediation:
- Ensure that the cinder_volume container is up on the specified node by running:
  # docker ps --all | grep cinder_volume
- If the container is down, start it by running:
  # docker start cinder_volume
- Check the service log at /var/log/hci/cinder/cinder-volume.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Glance API is down
Description: OpenStack Image (Glance) API service is down.
Remediation:
- Ensure that the glance_api container is up on the management node by running:
  # docker ps --all | grep glance_api
- If the container is down, start it by running:
  # docker start glance_api
- Check the service log at /var/log/hci/glance/glance-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Heat API is down
Description: OpenStack Orchestration API service (Heat) is down.
Remediation:
- Ensure that the heat_api container is up on the management node by running:
  # docker ps --all | grep heat_api
- If the container is down, start it by running:
  # docker start heat_api
- Check the service log at /var/log/hci/heat/heat-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Magnum API is down
Description: OpenStack Container API service (Magnum) is down.
Remediation:
- Ensure that the magnum_api container is up on the management node by running:
  # docker ps --all | grep magnum_api
- If the container is down, start it by running:
  # docker start magnum_api
- Check the service log at /var/log/hci/magnum/magnum-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Neutron API is down
Description: OpenStack Networking API service (Neutron) is down.
Remediation:
- Ensure that the neutron_server container is up on the management node by running:
  # docker ps --all | grep neutron_server
- If the container is down, start it by running:
  # docker start neutron_server
- Check the service log at /var/log/hci/neutron/neutron-server.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Neutron L3 agent is down
Description: OpenStack Networking (Neutron) L3 agent is down on host <hostname>.
Remediation:
- Ensure that the neutron_l3_agent container is up on the specified node by running:
  # docker ps --all | grep neutron_l3_agent
- If the container is down, start it by running:
  # docker start neutron_l3_agent
- Check the service log at /var/log/hci/neutron/neutron-l3-agent.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Neutron OpenvSwitch agent is down
Description: OpenStack Networking (Neutron) OpenvSwitch agent is down on host <hostname>.
Remediation:
- Ensure that the neutron_openvswitch_agent container is up on the specified node by running:
  # docker ps --all | grep neutron_openvswitch_agent
- If the container is down, start it by running:
  # docker start neutron_openvswitch_agent
- Check the service log at /var/log/hci/neutron/neutron-openvswitch-agent.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Neutron Metadata agent is down
Description: OpenStack Networking (Neutron) Metadata agent is down on host <hostname>.
Remediation:
- Ensure that the neutron_metadata_agent container is up on the specified node by running:
  # docker ps --all | grep neutron_metadata_agent
- If the container is down, start it by running:
  # docker start neutron_metadata_agent
- Check the service log at /var/log/hci/neutron/neutron-metadata-agent.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Neutron DHCP agent is down
Description: OpenStack Networking (Neutron) DHCP agent is down on host <hostname>.
Remediation:
- Ensure that the neutron_dhcp_agent container is up on the specified node by running:
  # docker ps --all | grep neutron_dhcp_agent
- If the container is down, start it by running:
  # docker start neutron_dhcp_agent
- Check the service log at /var/log/hci/neutron/neutron-dhcp-agent.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Nova API is down
Description: OpenStack Compute (Nova) API service is down.
Remediation:
- Ensure that the nova_api container is up on the management node by running:
  # docker ps --all | grep nova_api
- If the container is down, start it by running:
  # docker start nova_api
- Check the service log at /var/log/hci/nova/nova-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Nova Compute is down
Description: OpenStack Compute (Nova) agent is down on host <hostname>.
Remediation:
- Ensure that the nova_compute container is up on the specified node by running:
  # docker ps --all | grep nova_compute
- If the container is down, start it by running:
  # docker start nova_compute
- Check the service log at /var/log/hci/nova/nova-compute.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Nova Conductor is down
Description: OpenStack Compute (Nova) Conductor agent is down on host <hostname>.
Remediation:
- Ensure that the nova_conductor container is up on the specified node by running:
  # docker ps --all | grep nova_conductor
- If the container is down, start it by running:
  # docker start nova_conductor
- Check the service log at /var/log/hci/nova/nova-conductor.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Nova Scheduler is down
Description: OpenStack Compute (Nova) Scheduler agent is down on host <hostname>.
Remediation:
- Ensure that the nova_scheduler container is up on the specified node by running:
  # docker ps --all | grep nova_scheduler
- If the container is down, start it by running:
  # docker start nova_scheduler
- Check the service log at /var/log/hci/nova/nova-scheduler.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Octavia API is down
Description: OpenStack Load Balancer API service (Octavia) is down.
Remediation:
- Ensure that the octavia_api container is up on the management node by running:
  # docker ps --all | grep octavia_api
- If the container is down, start it by running:
  # docker start octavia_api
- Check the service log at /var/log/hci/octavia/octavia-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

OpenStack Placement API is down
Description: OpenStack Placement API service is down.
Remediation:
- Ensure that the placement_api container is up on the management node by running:
  # docker ps --all | grep placement_api
- If the container is down, start it by running:
  # docker start placement_api
- Check the service log at /var/log/hci/placement/placement-api.log.
- If you cannot troubleshoot the problem, contact the technical support team.

High request error rate for OpenStack API requests detected
Description: A request error rate of more than 5% has been detected for <object_id> over the last hour. Check the <object_id> resource usage.
Remediation:
- Check the status of the affected compute services (see the sketch below).
- If any services are down, bring them back up.
- If you cannot troubleshoot the problem, contact the technical support team.

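Besides restarting individual containers, you can check which compute services and agents are reported as up or down. A minimal sketch, assuming the OpenStack command-line client is installed and configured with admin credentials on the management node (this setup is deployment-specific):

# openstack compute service list
# openstack network agent list
# openstack volume service list

Services or agents shown with a down state point to the container to check and restart, following the matching entry above.
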
Compute cluster alerts
Compute cluster has failed
Description: Compute cluster has failed. Unable to manage virtual machines.
Remediation:
- Go to the Monitoring > Dashboard screen, and then click Grafana dashboard.
- Open the Compute service status dashboard and identify the failed service.
- Depending on the service, follow the instructions in the Compute service alerts section.

Cluster is running out of vCPU resources
Description: Cluster has reached 80% of the vCPU allocation limit.
Remediation: The compute cluster may soon run out of vCPU resources and become unable to accommodate new virtual machines. To avoid this, add more compute nodes or return fenced nodes, if any, to operation.

Cluster is out of vCPU resources
Description: Cluster has reached 95% of the vCPU allocation limit.
Remediation: The compute cluster will soon run out of vCPU resources and become unable to accommodate new virtual machines. To avoid this, add more compute nodes or return fenced nodes, if any, to operation.

Cluster is running out of memory
Description: Cluster has reached 80% of the memory allocation limit.
Remediation: The compute cluster may soon run out of RAM and become unable to accommodate new virtual machines. To avoid this, add more compute nodes or return fenced nodes, if any, to operation.

Cluster is out of memory
Description: Cluster has reached 95% of the memory allocation limit.
Remediation: The compute cluster will soon run out of RAM and become unable to accommodate new virtual machines. To avoid this, add more compute nodes or return fenced nodes, if any, to operation.

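To see how close the cluster is to its vCPU and memory allocation limits before adding nodes, a minimal sketch using the OpenStack CLI (assuming it is configured with admin credentials; the exact columns and the availability of the stats command vary between client releases):

# openstack hypervisor list --long
# openstack hypervisor stats show

These figures show allocated versus total resources per node and for the cluster; note that the allocation limit used by the alerts may also account for overcommitment ratios, so the percentages will not necessarily match exactly.
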
Virtual machine error
Description: Virtual machine <name> with ID <id> is in the 'Error' state.
Remediation:
- Examine the VM history on the History tab of the VM right pane and reset the VM state, as described in Troubleshooting virtual machines.
- If you cannot troubleshoot the problem, contact the technical support team.

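If you prefer the command line over the admin panel, a minimal sketch for inspecting the failure, assuming the OpenStack CLI is configured with admin credentials (<id> is the VM ID from the alert):

# openstack server show <id>
# openstack server set --state active <id>

The fault field in the server show output usually explains why the VM entered the 'Error' state. Resetting the state only clears the flag and does not fix the underlying problem, so follow Troubleshooting virtual machines for the actual recovery steps.
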
Virtual machine state mismatch
Description: The state of virtual machine <name> with ID <id> differs between the Nova database and the libvirt configuration.
Remediation: Do not try to migrate the VM or reset its state. Contact the technical support team.

Volume attachment details mismatch
Description: Attachment details for the volume with ID <id> differ between the Nova and libvirt databases.
Remediation: Do not try to migrate the VM or reset its state. Contact the technical support team.

Compute node alerts
Node is running out of vCPU resources
Description: Node <node> with ID <id> has reached 80% of the vCPU allocation limit.
Remediation: The compute node may soon run out of vCPU resources and become unable to accommodate new virtual machines. To avoid this, check the distribution of VMs across the compute cluster, and then migrate VMs from the specified node to less loaded compute nodes.

Node is out of vCPU resources
Description: Node <node> with ID <id> has reached 95% of the vCPU allocation limit.
Remediation: The compute node will soon run out of vCPU resources and become unable to accommodate new virtual machines. To avoid this, check the distribution of VMs across the compute cluster, and then migrate VMs from the specified node to less loaded compute nodes.

Node is running out of memory
Description: Node <node> with ID <id> has reached 80% of the memory allocation limit.
Remediation: The compute node may soon run out of RAM and become unable to accommodate new virtual machines. To avoid this, check the distribution of VMs across the compute cluster, and then migrate VMs from the specified node to less loaded compute nodes.

Node is out of memory
Description: Node <node> with ID <id> has reached 95% of the memory allocation limit.
Remediation: The compute node will soon run out of RAM and become unable to accommodate new virtual machines. To avoid this, check the distribution of VMs across the compute cluster, and then migrate VMs from the specified node to less loaded compute nodes.

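To rebalance the load from the command line, a minimal sketch using the OpenStack CLI, assuming it is configured with admin credentials; <node> is the hostname from the alert, and <server_id> is a placeholder for a VM you choose to move:

# openstack server list --all-projects --host <node>
# openstack server migrate --live-migration <server_id>

The first command lists the VMs currently placed on the overloaded node; the second live-migrates one of them and lets the scheduler pick a less loaded destination. Flag names for live migration differ slightly between client releases.
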
Project quota alerts
Project is out of vCPU resources
Description: Project <name> has reached 95% of the vCPU allocation limit.
Remediation: The project will soon run out of vCPU resources and become unable to create new virtual machines. To avoid this, add more vCPUs to the project quota.

Project is out of memory
Description: Project <name> has reached 95% of the memory allocation limit.
Remediation: The project will soon run out of RAM and become unable to create new virtual machines. To avoid this, add more RAM to the project quota.

Project is out of floating IP addresses
Description: Project <name> has reached 95% of the floating IP address allocation limit.
Remediation: The project will soon run out of floating IP addresses and become unable to assign them to virtual machines. To avoid this, add more floating IP addresses to the project quota.

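If you manage project quotas from the command line rather than the admin panel, a minimal sketch, assuming the OpenStack CLI is configured with admin credentials; the numbers are examples only (cores are vCPUs, RAM is in MiB):

# openstack quota show <project>
# openstack quota set --cores 64 --ram 131072 <project>
# openstack quota set --floating-ips 20 <project>

The first command shows the current limits for the project named in the alert; the other two raise the vCPU, RAM, and floating IP limits respectively.
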
Network is out of IP addresses
Description: Network <name> with ID <id> in project <name> has reached 95% of the IP address allocation limit.
Remediation: The network will soon run out of IP addresses, making it impossible to connect new virtual machines to it. To avoid this, add more allocation pools to the network.

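Allocation pools belong to the network's subnet. A minimal sketch for extending them with the OpenStack CLI, assuming it is configured; the address range shown is only an example and must fit your subnet:

# openstack subnet list --network <network_id>
# openstack subnet show <subnet_id>
# openstack subnet set --allocation-pool start=192.168.10.100,end=192.168.10.200 <subnet_id>

The show command displays the existing allocation pools; the set command adds another pool to the subnet.
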
Project is out of storage policy space
Description: Project <name> has reached 95% of the <policy_name> storage policy allocation limit.
Remediation: The project will soon run out of storage policy space and become unable to create new compute volumes with this storage policy. To avoid this, add more storage space to the project quota.

Other alerts
Libvirt service is down
Description: Libvirt service is down on node <node> with ID <id>. Check the service state and start it. If the service cannot start, contact the technical support team.
Remediation: Start the libvirtd service on the specified node by running:
# systemctl start libvirtd.service

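A slightly fuller sketch of the same remediation, assuming root access on the specified node:

# systemctl status libvirtd.service
# systemctl start libvirtd.service
# virsh list --all

If the service starts cleanly, virsh list --all should again show the VMs on the node; if it fails to start, collect the systemctl status output for the support team.
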
Docker service is down
Description: Docker service is down on host <hostname>.
Remediation: Start the Docker service on the specified node by running:
# systemctl start docker.service

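As with libvirt, a minimal sketch, assuming root access on the host from the alert:

# systemctl status docker.service
# systemctl start docker.service
# journalctl -u docker.service -n 50
# docker ps --all

If Docker starts, docker ps --all shows whether the service containers came back up; containers that remain stopped can be started individually as described in the Compute service alerts section.
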
RabbitMQ node is down
Description: One or more nodes in the RabbitMQ cluster are down.
Remediation: Contact the technical support team.

RabbitMQ split brain detected
Description: RabbitMQ cluster has experienced a split brain due to a network partition.
Remediation: Contact the technical support team.

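Before contacting support, you can collect basic cluster diagnostics. A minimal sketch, assuming RabbitMQ runs in a container on the management node; the container name below is a placeholder, so look it up first with docker ps:

# docker ps --all | grep rabbitmq
# docker exec <rabbitmq_container> rabbitmqctl cluster_status

The cluster_status output lists the cluster members, which of them are currently running, and any detected network partitions, which is the information the support team will ask for.
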
PostgreSQL database size is greater than 30 GB
Description: PostgreSQL database "<name>" on node "<hostname>" is greater than 30 GB in size.
Remediation: Verify that deleted entries are archived, or contact the technical support team.

PostgreSQL database uses more than 50% of node root partition
Description: PostgreSQL databases on node "<hostname>" with ID "<id>" use more than 50% of the node root partition.
Remediation: Verify that deleted entries are archived, or contact the technical support team.

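To see which databases are consuming the space before archiving or contacting support, a minimal sketch, assuming PostgreSQL runs directly on the node under the postgres system user (in containerized deployments, run psql inside the database container instead):

# su - postgres -c "psql -c \"SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;\""
# df -h /

The query lists all databases ordered by size, and df shows how much of the root partition remains.
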