Infrastructure alerts

The following infrastructure alerts are generated and displayed in the admin panel:

Title Message Severity
License alerts
License is not loaded License is not installed. warning
License expired The license of cluster “<cluster_name>” has expired. Contact your reseller to update your license immediately! critical
Cluster alerts
Cluster is out of space Cluster has just <free_space> TB (<free_space_in_percent>%) of physical storage space left. You may want to free some space or add more storage capacity. warning
Cluster “<cluster_name>” has run out of storage space allowed by license. No more data can be written. Please contact your reseller to update your license immediately! warning
Licensed storage capacity is low Cluster has reached 80% of licensed storage capacity. warning
Licensed storage capacity is critically low Cluster has reached 90% of licensed storage capacity. critical
Not enough cluster nodes Cluster “<cluster_name>” has only {1,2} node(s) instead of the recommended minimum of 3. Add {2,1} or more nodes to the cluster. warning
High availability for the admin panel must be configured Configure high availability for the admin panel in Settings > Management node. Otherwise the admin panel will be a single point of failure. critical
Management node backup does not exist Management node backup is older than <number_of_days> days. critical
The last management node backup has failed, does not exist, or is too old. critical
Changes to the management database are not replicated Changes to the management database are not replicated to the node "<hostname>" because it is offline. Check the node's state and connectivity. critical
Changes to the management database are not replicated to the node "<hostname>". Please contact the technical support.
Cluster connectivity alerts
Cluster network connectivity problem All nodes have network connectivity problems: unstable connectivity via network "<network_name>" due to packet loss. critical
All nodes have network connectivity problems: no connectivity via network "<network_name>". critical
Node network connectivity problem Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to the loss of all MTU-sized packets. critical
Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to the loss of some MTU-sized packets. critical
Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to packet loss. critical
Node "<hostname>" has network connectivity problems: no connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>". critical
Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to the loss of all MTU-sized packets. critical
Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to packet loss. critical
Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to the loss of some MTU-sized packets. critical
MTU mismatch Some interfaces have MTU that differs from other interfaces in the same network: network "<network_name>" interface@host "<iface>@<hostname>". critical
Node alerts
Node is offline Node “<hostname>” is offline. warning
Node got offline too many times Node “<hostname>” got offline too many times last hour. warning
Kernel is outdated Node “<hostname>” is not running the latest kernel. warning
OOM killer triggered OOM killer has been triggered on node “<hostname>”. warning
Time is not synced Time on node “<hostname>” differs from time on backend node by more than 5 seconds. warning
No Internet access Cluster node <hostname> cannot reach the repository. Make sure that all cluster nodes have Internet access. warning
Incompatible hardware detected Incompatible hardware detected on node "<hostname>": <hardware_list>. Using Mellanox and AMD may lead to data loss. Please double check that SR-IOV is properly enabled. critical
Swap space is running low <swap_proportion>% of swap is used on node "<hostname>". critical
Node has high CPU usage Node <hostname> has CPU usage higher than 90%. The current value is <value>%. warning
Node has high memory usage Node <hostname> has memory usage higher than 95%. The current value is <value>%. warning
Node has high disk I/O usage Disk /dev/<disk_name> on node <hostname> has I/O usage higher than 85%. The current value is <value>%. warning
Node has high receive packet loss rate Node <hostname> has <value> receive packet loss rate reported by job <job_name>. warning
Node has high transmit packet loss rate Node <hostname> has <value> transmit packet loss rate reported by job <job_name>. warning
Node has high receive packet error rate Node <hostname> has <value> receive packet error rate reported by job <job_name>. warning
Node has high transmit packet error rate Node <hostname> has <value> transmit packet error rate reported by job <job_name>. warning
Disk alerts
S.M.A.R.T. warning Disk “<disk_name>” (<serial>) on node “<hostname>” has failed a S.M.A.R.T. check. critical
Disk error Disk “<disk_name>” (<serial>) failed on node “<hostname>”. critical
Disk is out of space Root partition on node “<hostname>” is running out of space. warning
Disk write cache is enabled Disk write cache is enabled for disk “<disk_name>” on node “<hostname>”. Disable it to avoid potential data loss in case of a power outage. warning
Disk write cache status unknown Cannot determine the status of write cache for disk “<disk_name>” on node “<hostname>”. warning
Software RAID is not fully synced Software RAID <disk_name> on node <hostname> is <value>% synced. warning
Systemd service is flapping Systemd service <service_name> on node <hostname> has changed its state more than 5 times in 5 minutes or 15 times in one hour. critical
Network alerts
Network warning Network interface “<iface_name>” has incorrect settings: <duplex> duplex and <speed> speed. warning
Network interface “<iface_name>” on node “<hostname>” is missing important features (or has them disabled): “<feature_name>”. warning
Network interface “<iface_name>” on node “<hostname>” is not in the full duplex mode. warning
Network interface “<iface_name>” on node “<hostname>” has speed lower than the minimally required 1 Gbps. warning
Network interface “<iface_name>” on node “<hostname>” has an undefined speed. warning
Network interface is flapping Network interface <iface_name> on node <hostname> is flapping. warning
Network bond is not redundant Network bond <iface_name> on node <hostname> is missing <number_of_ifaces> subordinate interface(s). critical
Update alerts
Software updates exist Software updates exist for the node <hostname>. Current version: <current_version>. Available version: <available_version>. information
Update check failed Update check failed on the node <hostname>. Please check access to the update repository. warning
Multiple update checks failed Update checks failed multiple times on the node <hostname>. Please check access to the update repository. critical
Update download failed Update download failed on the node <hostname>. critical
Node update failed Software update failed on the node <hostname>. critical
Update failed Update failed for the management panel and compute API. critical
Cluster update failed Update failed for the cluster. critical
Entering maintenance for update failed Entering maintenance failed while updating the node <hostname>. critical
Service alerts
Compute cluster has failed Compute cluster has failed. Unable to manage virtual machines. critical
Certificate expiration Acronis Backup Gateway certificate has expired. All backup operations have been stopped. Update the certificate on the Backup Gateway screen. critical
Acronis Backup Gateway certificate will expire soon. Update the certificate on the Backup Gateway screen. warning
Acronis Backup Gateway certificate will expire on "<expiration_date>". Update the certificate on the Backup Gateway screen.
Redundancy warning iSCSI LUN <lun_id> of target group “<target_group>” is set to failure domain “disk” even though <number_of_nodes> nodes are available. It is recommended to set the failure domain to “host” so that the LUN can survive host failures in addition to disk failures. warning
iSCSI major upgrade failed iSCSI major upgrade failed. Will be retried… critical
NFS service has unavailable FS services Some File services are not running on <node>. Check the service status in the command-line interface. warning
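
Several alerts in the table are defined by numeric thresholds: licensed storage capacity (80% is a warning, 90% is critical) and systemd service flapping (more than 5 state changes in 5 minutes or 15 in one hour). The sketch below only illustrates how such rules could be evaluated. It is not the product's implementation. The function names and parameters are hypothetical, and the boundary handling is an assumption: "has reached" is read as "greater than or equal to", and "more than" is read as applying to both flapping limits.

```python
from typing import Optional, Sequence


def licensed_capacity_severity(used_tb: float, licensed_tb: float) -> Optional[str]:
    """Map licensed capacity usage to an alert severity (hypothetical helper).

    Returns "critical" at 90% or more, "warning" at 80% or more,
    and None when no alert applies.
    """
    if licensed_tb <= 0:
        raise ValueError("licensed_tb must be positive")
    used_percent = 100.0 * used_tb / licensed_tb
    if used_percent >= 90:
        return "critical"
    if used_percent >= 80:
        return "warning"
    return None


def is_service_flapping(change_times: Sequence[float], now: float) -> bool:
    """Check the flapping rule against state-change timestamps (in seconds).

    Assumed reading of the rule: more than 5 changes in the last 5 minutes,
    or more than 15 changes in the last hour.
    """
    last_5_min = sum(1 for t in change_times if now - 300 <= t <= now)
    last_hour = sum(1 for t in change_times if now - 3600 <= t <= now)
    return last_5_min > 5 or last_hour > 15


if __name__ == "__main__":
    print(licensed_capacity_severity(85, 100))
    print(is_service_flapping([0, 10, 20, 30, 40, 50], now=60))
```

The same pattern (compare a measured value against fixed limits and return a severity) would apply to the other threshold-based rows, such as CPU usage above 90% or memory usage above 95%.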