Calculating disk health
You can monitor node disks by using the vstorage-disks-monitor service. This service runs on every management node and queries chunk server (CS) metrics from the Prometheus service for further analysis.
vstorage-disks-monitor detects CSes that are not responding and marks them as ill (unresponsive). To avoid degrading cluster performance, such CSes are fenced from the cluster I/O.
The service also calculates disk health as a percentage, based on the weight assigned to each metric. You can configure the weights in the /etc/disks-monitor/analyzers.yml configuration file. The service logs are stored in /var/log/disks-monitor/disks-monitor.log.
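To illustrate how per-metric weights can combine into a single health percentage, here is a minimal Python sketch. It assumes a plain weighted average of normalized metric scores. The metric names, weight values, and the formula itself are hypothetical; they do not reflect the actual contents of analyzers.yml or the algorithm that vstorage-disks-monitor uses.

```python
# Hypothetical sketch: the metric names, weights, and formula below are
# illustrative assumptions, not the contents of analyzers.yml or the
# service's real algorithm.

def disk_health(scores, weights):
    """Return disk health in percent as a weighted average of metric scores.

    scores:  metric name -> score in [0.0, 1.0], where 1.0 means healthy
    weights: metric name -> non-negative weight
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one metric must have a non-zero weight")
    weighted = sum(scores[name] * weights[name] for name in scores)
    return 100.0 * weighted / total_weight


# Hypothetical metrics: a heavier weight makes a metric count for more.
weights = {"io_latency": 3.0, "io_errors": 5.0, "queue_depth": 2.0}
scores = {"io_latency": 0.75, "io_errors": 1.0, "queue_depth": 0.5}

print(f"{disk_health(scores, weights):.1f}%")  # prints "82.5%"
```

In this sketch, raising the weight of a metric makes a poor score on that metric pull the overall percentage down more strongly, which is the general effect that tuning weights is meant to have.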
The service can work in two modes:
- As a daemon, if you use the vstorage-disks-monitor sidecar command
- As a tool for listing disk statuses and alerts, if you run the vstorage-disks-monitor health and vstorage-disks-monitor alerts commands
You can disable fencing of ill CSes by running the vstorage-disks-monitor sidecar --fencing.enable command.
Limitations
- Detection of unresponsive disks is disabled in clusters deployed on virtual machines.