Calculating disk health

You can monitor node disks with the vstorage-disks-monitor service. The service runs on every management node and queries chunk server (CS) metrics from the Prometheus service for analysis. It detects CSes that are not responding and marks them as ill (unresponsive). To avoid degrading cluster performance, such CSes are fenced from cluster I/O.

The service also calculates disk health as a percentage, based on the weight of each metric. You can configure the weights in the /etc/disks-monitor/analyzers.yml configuration file. The service logs are stored in /var/log/disks-monitor/disks-monitor.log.
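To illustrate how per-metric weights can combine into a single percentage, here is a minimal Python sketch of a weighted average. It is not the service's actual formula, which is not described here. The metric names, the weight values, and the assumption that each metric is normalized to a score between 0.0 and 1.0 are all hypothetical; the real metrics and weights are those defined in /etc/disks-monitor/analyzers.yml.

```python
# Hypothetical metric names and weights, for illustration only.
# The real ones are defined in /etc/disks-monitor/analyzers.yml and may differ.
EXAMPLE_WEIGHTS = {
    "io_latency": 5,
    "io_errors": 3,
    "smart_status": 2,
}


def disk_health(scores, weights=EXAMPLE_WEIGHTS):
    """Return disk health in percent as a weighted average of metric scores.

    Each score is assumed to lie in [0.0, 1.0], where 1.0 means the metric
    shows no problems. This is an assumed model, not the service's own.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name in weights:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name!r} must be within [0.0, 1.0]")
    # Metrics with larger weights pull the result more strongly.
    weighted_sum = sum(weights[name] * scores[name] for name in weights)
    return 100.0 * weighted_sum / total_weight


if __name__ == "__main__":
    sample = {"io_latency": 0.5, "io_errors": 1.0, "smart_status": 1.0}
    print(f"health: {disk_health(sample):.1f}%")
```

In this model, raising the weight of a metric makes a poor score on that metric lower the overall health more, which is the general effect to expect when tuning weights in the configuration file.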

The service can work in two modes:

  • As a daemon, when you run the vstorage-disks-monitor sidecar command
  • As a tool for listing disk statuses and alerts, when you run the vstorage-disks-monitor health and vstorage-disks-monitor alerts commands

You can disable the fencing of ill CSes by running the vstorage-disks-monitor sidecar --fencing.enable command.

Limitations

  • Detection of unresponsive disks is disabled in clusters deployed on virtual machines.