Configuring retention policy for Prometheus metrics

The Prometheus service that monitors the cluster runs on the management node and stores its data there. By default, Prometheus metrics are kept for seven days. This retention period may be too short for troubleshooting. You can increase it manually by modifying the Prometheus configuration file.

However, with a long retention period, the root partition where the data is stored may run out of free space. To prevent this, you can instead limit the maximum size of the stored Prometheus metrics. When the limit is reached, the oldest data is removed first.

To increase the retention period

  1. On the management node, open the /etc/sysconfig/prometheus file for editing, set the required retention period in the STORAGE_RETENTION option, and then save your changes. For example:

    STORAGE_RETENTION="--storage.tsdb.retention.time=30d"
    
  2. Restart the Prometheus service:

    systemctl restart prometheus.service
    

If high availability is enabled in the storage cluster, repeat these steps on the other two management nodes.
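The manual edit in step 1 can also be scripted, which may help when the same change has to be made on several management nodes. The following is a minimal sketch, not a supported tool: the function name set_storage_retention is hypothetical, and it assumes that STORAGE_RETENTION appears at most once in the file, at the start of a line. The demonstration runs against a scratch file with made-up starting content, not against the live /etc/sysconfig/prometheus.

```shell
#!/bin/sh
# Hypothetical helper: set STORAGE_RETENTION in a sysconfig-style file.
# Assumes the option appears at most once, at the start of a line.
set_storage_retention() {
    conf_file=$1   # for example, /etc/sysconfig/prometheus
    flag=$2        # for example, --storage.tsdb.retention.time=30d
    if grep -q '^STORAGE_RETENTION=' "$conf_file"; then
        # Replace the existing line; the previous version is kept as <file>.bak.
        sed -i.bak "s|^STORAGE_RETENTION=.*|STORAGE_RETENTION=\"$flag\"|" "$conf_file"
    else
        # The option is not present yet: append it.
        printf 'STORAGE_RETENTION="%s"\n' "$flag" >> "$conf_file"
    fi
}

# Demonstration on a scratch file (made-up starting content).
demo_conf=$(mktemp)
printf 'STORAGE_RETENTION="--storage.tsdb.retention.time=7d"\n' > "$demo_conf"
set_storage_retention "$demo_conf" "--storage.tsdb.retention.time=30d"
cat "$demo_conf"   # prints STORAGE_RETENTION="--storage.tsdb.retention.time=30d"
```

To apply it for real, you would pass /etc/sysconfig/prometheus as the first argument and then restart the Prometheus service as described in step 2.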

To switch from the time-based retention policy to the size-based one

  1. On the management node, open the /etc/sysconfig/prometheus file for editing, replace the time-based flag in the STORAGE_RETENTION option with the size-based one, and then save your changes. For example:

    STORAGE_RETENTION="--storage.tsdb.retention.size=10GB"
    
  2. Restart the Prometheus service:

    systemctl restart prometheus.service
    

If high availability is enabled in the storage cluster, repeat these steps on the other two management nodes.
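Before or after switching, it can be useful to check which retention flag a node is actually configured with, for instance to confirm that all three management nodes agree. The sketch below is an illustration only: the function name retention_policy is hypothetical, and it simply inspects the STORAGE_RETENTION line of the file it is given. The demonstration uses a scratch file with sample content, not the live configuration.

```shell
#!/bin/sh
# Hypothetical helper: report which retention flag a sysconfig-style file sets.
retention_policy() {
    conf_file=$1   # for example, /etc/sysconfig/prometheus
    # Use the last STORAGE_RETENTION line in case the option is repeated.
    value=$(grep '^STORAGE_RETENTION=' "$conf_file" | tail -n 1)
    case $value in
        *retention.size=*retention.time=*|*retention.time=*retention.size=*)
            echo "both" ;;
        *--storage.tsdb.retention.size=*) echo "size" ;;
        *--storage.tsdb.retention.time=*) echo "time" ;;
        *) echo "unset" ;;
    esac
}

# Demonstration on a scratch file (sample content).
demo_conf=$(mktemp)
printf 'STORAGE_RETENTION="--storage.tsdb.retention.size=10GB"\n' > "$demo_conf"
retention_policy "$demo_conf"   # prints "size"
```

On a management node, you would call the function with /etc/sysconfig/prometheus. Note that it reports only what the file contains; the running service picks up a changed value only after it is restarted.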