Compute alerts

Compute alerts are generated from the metrics described in Compute metrics and are displayed in the admin panel.

Compute service alerts

Keystone API service is down

OpenStack service API upstream is down

All OpenStack service API upstreams are down

OpenStack Cinder Scheduler is down

OpenStack Cinder Volume agent is down

OpenStack Neutron L3 agent is down

OpenStack Neutron Open vSwitch agent is down

OpenStack Neutron Metadata agent is down

OpenStack Neutron DHCP agent is down

OpenStack Nova Compute is down

OpenStack Nova Conductor is down

OpenStack Nova Scheduler is down

OpenStack Octavia Provisioning Worker v1 is down

OpenStack Octavia Provisioning Worker v2 is down

OpenStack Octavia Housekeeping service is down

OpenStack Octavia HealthManager service is down

High request error rate for OpenStack API requests detected

Compute cluster alerts

Compute cluster has failed

Cluster is running out of vCPU resources

Cluster is out of vCPU resources

Cluster is running out of memory

Cluster is out of memory

Virtual machine error

Virtual machine state mismatch

Volume attachment details mismatch

Virtual network port check failed

Backup plan failed

Virtual router HA has more than one active L3 agent

Virtual router HA with ID <router_id> has more than one active L3 agent. Please contact the technical support.

Virtual router HA has no active L3 agent

Virtual router HA with ID <router_id> has no active L3 agent. Please contact the technical support.

Virtual router SNAT-related port has invalid host binding

Virtual router SNAT-related port with ID <id> is bound to the Standby HA router node. Please contact the technical support.

Virtual router gateway port has invalid host binding

Virtual router gateway port with ID <id> is bound to the Standby HA router node. Please contact the technical support.

Neutron bridge mapping not found

Physical network "<physical_network>" is not found in the bridge mapping on node "<hostname>". Virtual network "<virtual_network>" on this node is most likely not functioning. Please contact the technical support.

Virtual DHCP server is unavailable from node

Built-in DHCP server for virtual network "<network_id>" is not available from node "<hostname>". Please check the neutron-dhcp-agent service or contact the technical support.

Virtual DHCP server is unavailable

Built-in DHCP server for virtual network "<network_id>" is not available from cluster nodes. Please check the neutron-dhcp-agent service or contact the technical support.

Virtual DHCP server HA degraded on node

Only one built-in DHCP server for virtual network "<network_id>" is reachable from node "<hostname>". DHCP high availability entered the degraded state. Please check the neutron-dhcp-agent service or contact the technical support.

Virtual DHCP server HA degraded

Only one built-in DHCP server for virtual network "<network_id>" is reachable from cluster nodes. DHCP high availability entered the degraded state. Please check the neutron-dhcp-agent service or contact the technical support.

Unrecognized DHCP servers detected from node

Built-in DHCP service for virtual network "<network_id>" may be malfunctioning on node "<hostname>". Please ensure that virtual machines are receiving correct DHCP addresses or contact the technical support.

Unrecognized DHCP servers detected

Built-in DHCP service for virtual network "<network_id>" may be malfunctioning. Please ensure that virtual machines are receiving correct DHCP addresses or contact the technical support.
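The "unavailable" and "HA degraded" DHCP alerts above differ only in how many built-in DHCP servers a node can reach for a given virtual network. The following is a minimal sketch of that distinction, not the product's implementation. The function name, the state labels, and the input shape (a mapping of hostname to the number of reachable DHCP servers) are hypothetical, and treating zero reachable servers as "unavailable" is an assumption drawn from the alert wording.

```python
def classify_dhcp(reachable_by_node):
    """Map each node to a DHCP state based on how many servers it can reach.

    reachable_by_node: dict of hostname -> number of reachable built-in
    DHCP servers for one virtual network (hypothetical input shape).
    """
    states = {}
    for node, count in reachable_by_node.items():
        if count <= 0:
            # Assumed to match "Virtual DHCP server is unavailable from node".
            states[node] = "unavailable"
        elif count == 1:
            # Matches "Only one built-in DHCP server ... is reachable".
            states[node] = "degraded"
        else:
            # Two or more reachable servers: no alert in this sketch.
            states[node] = "ok"
    return states
```

For example, a node that reaches a single DHCP server would be reported as "degraded", which corresponds to the "Virtual DHCP server HA degraded on node" alert.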

Compute node alerts

Node is running out of vCPU resources

Node is out of vCPU resources

Node is running out of memory

Node is out of memory

Node had a fenced state for 1 hour

For the last 2 hours node <node> with ID <id> had a fenced state at least for 1 hour.
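The condition in the message above (fenced for at least 1 hour within the last 2 hours) can be read as a sliding-window check. The sketch below illustrates one way to evaluate it, assuming the fenced periods are available as non-overlapping (start, end) timestamp pairs. The function names and the data shape are hypothetical and do not describe how the product computes the alert.

```python
from datetime import datetime, timedelta

def fenced_time_in_window(intervals, now, window=timedelta(hours=2)):
    """Sum the time spent fenced inside the trailing window ending at `now`.

    intervals: iterable of (start, end) datetimes, assumed non-overlapping.
    """
    window_start = now - window
    total = timedelta(0)
    for start, end in intervals:
        # Clip each fenced interval to the window before adding it up.
        clipped_start = max(start, window_start)
        clipped_end = min(end, now)
        if clipped_end > clipped_start:
            total += clipped_end - clipped_start
    return total

def should_alert(intervals, now, threshold=timedelta(hours=1)):
    """Return True if the node was fenced for at least `threshold` in the window."""
    return fenced_time_in_window(intervals, now) >= threshold
```

With this reading, two separate fenced periods of 40 and 25 minutes inside the last 2 hours add up to 65 minutes and would raise the alert, while a single 90-minute period of which only 30 minutes fall inside the window would not.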

Domain quota alerts

Domain is out of vCPU resources

Domain is out of vCPU resources

Domain is out of memory

Domain is out of memory

Domain is out of storage policy space

Domain is out of storage policy space

Project quota alerts

Project is out of vCPU resources

Project is out of memory

Project is out of floating IP addresses

Network is out of IP addresses

Project is out of storage policy space

Other alerts

Libvirt service is down

Docker service is down

RabbitMQ node is down

RabbitMQ split brain detected

PostgreSQL database size is greater than 30 GB

PostgreSQL database "<name>" on node "<hostname>" is greater than 30 GB in size. Verify that deleted entries are archived or contact the technical support.

PostgreSQL database uses more than 50% of node root partition

PostgreSQL databases on node "<hostname>" with ID "<id>" use more than 50% of node root partition. Verify that deleted entries are archived or contact the technical support.

Kafka SSL CA certificate will expire in less than 30 days

Kafka SSL CA certificate will expire in <number> days. Please renew the certificate.

Kafka SSL CA certificate has expired

Kafka SSL CA certificate has expired. Please renew the certificate.

Kafka SSL client certificate will expire in less than 30 days

Kafka SSL client certificate will expire in <number> days. Please renew the certificate.

Kafka SSL client certificate has expired

Kafka SSL client certificate has expired. Please renew the certificate.
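The Kafka certificate alerts above follow a simple rule: one alert when fewer than 30 days remain, another once the certificate has expired. The following minimal sketch shows that rule, assuming the certificate's expiration time (notAfter) has already been read as a timezone-aware datetime. The function name and return values are hypothetical, and the product may calculate the remaining days differently.

```python
from datetime import datetime, timezone

def certificate_alert(not_after, now, warn_days=30):
    """Return an alert message for a certificate, or None if no alert applies.

    not_after: certificate expiration time (timezone-aware datetime).
    now: current time (timezone-aware datetime).
    """
    remaining = not_after - now
    if remaining.total_seconds() <= 0:
        return "expired"
    if remaining.days < warn_days:
        # Whole days left; "less than 30 days" is taken as days < 30.
        return f"will expire in {remaining.days} days"
    return None
```

Under these assumptions, a certificate that expires 10 days from now yields "will expire in 10 days", and one that expires exactly 30 days from now does not yet trigger the warning.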