Alert list
The following alerts are generated and displayed in the admin panel:
Title | Message | Severity |
---|---|---|
License alerts | ||
License is not loaded | License is not installed. | warning |
License expired | The license of cluster “<cluster_name>” has expired. Contact your reseller to update your license immediately! | critical |
Cluster alerts | ||
Cluster is out of space | Cluster has just <free_space> TB (<free_space_in_percent>%) of physical storage space left. You may want to free some space or add more storage capacity. | warning |
Cluster is out of space | Cluster “<cluster_name>” has run out of storage space allowed by license. No more data can be written. Please contact your reseller to update your license immediately! | warning |
Not enough cluster nodes | Cluster “<cluster_name>” has only {1,2} node(s) instead of the recommended minimum of 3. Add {2,1} or more nodes to the cluster. | warning |
High availability for the admin panel must be configured | Configure high availability for the admin panel in Settings > Management node. Otherwise the admin panel will be a single point of failure. | critical |
Management node backup does not exist | Management node backup is older than <number_of_days> days. | critical |
Management node backup does not exist | The last management node backup has failed, does not exist, or is too old. | critical |
Changes to the management database are not replicated | Changes to the management database are not replicated to the node "<hostname>" because it is offline. Check the node's state and connectivity. | critical |
Changes to the management database are not replicated | Changes to the management database are not replicated to the node "<hostname>". Please contact the technical support. | critical |
Cluster connectivity alerts | ||
Cluster network connectivity problem | All nodes have network connectivity problems: unstable connectivity via network "<network_name>" due to packet loss. | critical |
Cluster network connectivity problem | All nodes have network connectivity problems: no connectivity via network "<network_name>". | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to the loss of all MTU-sized packets. | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to the loss of some MTU-sized packets. | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity via network "<network_name>" due to packet loss. | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: no connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>". | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to the loss of all MTU-sized packets. | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to packet loss. | critical |
Node network connectivity problem | Node "<hostname>" has network connectivity problems: unstable connectivity to node "<hostname>" with interface "<iface>" via interface "<iface>" due to the loss of some MTU-sized packets. | critical |
MTU mismatch | Some interfaces have MTU that differs from other interfaces in the same network: network "<network_name>" interface@host "<iface>@<hostname>". | critical |
Metadata service alerts | ||
Not enough metadata disks | Cluster “<cluster_name>” has only one MDS. There is only one disk with the metadata role at the moment. Losing this disk will completely destroy all cluster data irrespective of the redundancy schema. | critical |
Not enough metadata disks | Cluster “<cluster_name>” requires more disks with the metadata role. Losing one more MDS will halt cluster operation. | warning |
Configuration warning | Node “<hostname>” has more than one metadata service located on it. It is recommended to have only one metadata service per node. Delete the extra metadata service(s) from this node and create them on other nodes instead. | warning |
Configuration warning | Cluster “<cluster_name>” has four metadata services. This configuration slows down the cluster performance and does not improve its availability. For a cluster of four nodes, it is enough to configure three MDSes. Delete an extra MDS from one of the cluster nodes. | warning |
Configuration warning | Cluster “<cluster_name>” has more than five metadata services. This configuration slows down the cluster performance and does not improve its availability. For a large cluster, it is enough to configure five MDSes. Delete extra MDSes from the cluster nodes. | warning |
Service failed | Metadata service #<id> is in the “<status>” state. Node: <hostname>. Disk: <disk_name>. Disk serial: <disk_serial>. | warning |
Metadata disk is out of space | Metadata disk on node “<hostname>” is running out of space. | warning |
Chunk service alerts | ||
Not enough disks with storage role | Cluster “<cluster_name>” has no disks with the storage role. | warning |
Not enough disks with storage role | Cluster “<cluster_name>” has too few available CSes. | warning |
Service failed | Storage service #<id> is in the “<status>” state. Node: <hostname>. Disk: <disk_name>. Disk serial: <disk_serial>. | warning |
CS configuration is not optimal | CS#<cs_id> on tier <tier> has incorrect journalling settings. | warning |
CS configuration is not optimal | Encryption is disabled for CS#<cs_id> on tier <tier> but is enabled for other CSes on the same tier. | warning |
Storage disk is slow | Disk <disk_name> (CS#<cs_id>) on node <hostname> is slow and needs to be replaced. | warning |
Disk cache settings are not optimal | Disk <disk_name> (CS#<cs_id>) on node <hostname> has cache settings different from other disks of the same tier. | warning |
Node alerts | ||
Node is offline | Node “<hostname>” is offline. | warning |
Node got offline too many times | Node “<hostname>” got offline too many times last hour. | warning |
Software updates exist | Software updates exist for the node “<hostname>”. | warning |
Kernel is outdated | Node “<hostname>” is not running the latest kernel. | warning |
OOM killer triggered | OOM killer has been triggered on node “<hostname>”. | warning |
Time is not synced | Time on node “<hostname>” differs from time on backend node by more than 5 seconds. | warning |
No Internet access | Cluster node <hostname> cannot reach the repository. Make sure that all cluster nodes have Internet access. | warning |
Incompatible hardware detected | Incompatible hardware detected on node "<hostname>": <hardware_list>. Using Mellanox and AMD may lead to data loss. Please double check that SR-IOV is properly enabled. | critical |
Disk alerts | ||
S.M.A.R.T. warning | Disk “<disk_name>” (<serial>) on node “<hostname>” has failed a S.M.A.R.T. check. | critical |
Disk error | Disk “<disk_name>” (<serial>) failed on node “<hostname>”. | critical |
Disk is out of space | Root partition on node “<hostname>” is running out of space. | warning |
Disk write cache is enabled | Disk write cache is enabled for disk “<disk_name>” on node “<hostname>”. Disable it to avoid potential data loss in case of a power outage. | warning |
Disk write cache status unknown | Cannot determine the status of write cache for disk “<disk_name>” on node “<hostname>”. | warning |
Network alerts | ||
Network warning | Network interface “<iface_name>” has incorrect settings: <duplex> duplex and <speed> speed. | warning |
Network warning | Network interface “<iface_name>” on node “<hostname>” is missing important features (or has them disabled): “<feature_name>”. | warning |
Network warning | Network interface “<iface_name>” on node “<hostname>” is not in the full duplex mode. | warning |
Network warning | Network interface “<iface_name>” on node “<hostname>” has speed lower than the minimally required 1 Gbps. | warning |
Network warning | Network interface “<iface_name>” on node “<hostname>” has an undefined speed. | warning |
Other alerts | ||
Compute cluster has failed | Compute cluster has failed. Unable to manage virtual machines. | critical |
Redundancy warning | iSCSI LUN <lun_id> of target group “<target_group>” is set to failure domain “disk” even though <number_of_nodes> nodes are available. It is recommended to set the failure domain to “host” so that the LUN can survive host failures in addition to disk failures. | warning |
Redundancy warning | S3 is set to failure domain “disk” even though <number_of_nodes> nodes are available. It is recommended to set the failure domain to “host” so that S3 can survive host failures in addition to disk failures. | warning |
Certificate expiration | Acronis Backup Gateway certificate has expired. All backup operations have been stopped. Update the certificate on the Backup Gateway screen. | critical |
Certificate expiration | Acronis Backup Gateway certificate will expire soon. Update the certificate on the Backup Gateway screen. | warning |
Certificate expiration | Acronis Backup Gateway certificate will expire on "<expiration_date>". Update the certificate on the Backup Gateway screen. | warning |
iSCSI major upgrade failed | iSCSI major upgrade failed. Will be retried… | critical |
S3 cluster misconfiguration | The S3 cluster configuration is not highly available. If one S3 node fails, the entire S3 cluster may become non-operational. | warning |
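
The metadata service rules stated in the alerts above (a single MDS is critical; four MDSes, more than five MDSes, or more than one MDS on a node each raise a warning) can be illustrated with a short sketch. This is only an illustration of the rules as worded in the table, not the product's actual alerting logic: the function name `check_metadata_services` and its input (a list with one host name per metadata service) are hypothetical, and the "requires more disks with the metadata role" alert is left out because the table does not state its threshold.

```python
from collections import Counter
from typing import List, Tuple


def check_metadata_services(mds_hosts: List[str]) -> List[Tuple[str, str]]:
    """Return (severity, note) pairs for a cluster's metadata services.

    mds_hosts holds one host name per MDS (hypothetical input format).
    """
    findings: List[Tuple[str, str]] = []
    total = len(mds_hosts)

    # "Not enough metadata disks": only one MDS in the cluster.
    if total == 1:
        findings.append(("critical", "only one MDS"))

    # "Configuration warning": four MDSes, or more than five.
    if total == 4:
        findings.append(("warning", "four MDSes; three are enough"))
    if total > 5:
        findings.append(("warning", "more than five MDSes; five are enough"))

    # "Configuration warning": more than one MDS on the same node.
    for host, count in sorted(Counter(mds_hosts).items()):
        if count > 1:
            findings.append(("warning", f"node {host} hosts {count} MDSes"))

    return findings


if __name__ == "__main__":
    for severity, note in check_metadata_services(["n1", "n1", "n2", "n3"]):
        print(severity, note)
```

With three services on three different nodes the function returns an empty list, which matches the table: none of the metadata configuration alerts apply to that layout.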