Object storage metrics
Metrics used for monitoring object storage are configured in the Prometheus recording rules and can be found in these files on any node in the cluster:
- /var/lib/prometheus/rules/s3.rules
- /var/lib/prometheus/rules/ostor.rules
Metrics that are used to generate object storage alerts are added to the alerting rules in /var/lib/prometheus/alerts/s3.rules. These metrics are described in the table:
Metric | Description |
---|---|
instance_vol_svc:ostor_s3gw_req:rate5m | Number of all requests per second by a particular S3 gateway service for 5 minutes |
instance_vol_svc:ostor_s3gw_req_cancelled:rate5m | Number of canceled requests per second by a particular S3 gateway service for 5 minutes |
instance_vol_svc:ostor_req_server_err:rate5m | Number of failed requests with a server error (5XX status code) per second by a particular S3 gateway service for 5 minutes |
instance_vol_svc:ostor_s3gw_get_req_latency_ms_bucket:rate5m | Current GET request latency by a particular S3 gateway service for 5 minutes, for each bucket |
instance_vol_svc:ostor_commit_latency_us_bucket:rate5m | Current commit latency by the Object storage service for 5 minutes, for each bucket |
instance_vol_svc_req:ostor_os_req_latency_ms_bucket:rate5m | Current request latency by a particular OS service for 5 minutes, for each bucket |
instance_vol_svc_req:ostor_ns_req_latency_ms_bucket:rate5m | Current request latency by a particular NS service for 5 minutes, for each bucket |
pcs_process_inactive_seconds_total | Total amount of time a process has been inactive |
process_cpu_seconds_total | Total amount of time a process has used CPU |
ostor_svc_start_failed_count_total | Total number of failed attempts to start a service |
ostor_svc_registry_cfg_failed_total | Total number of failed attempts to connect to the configuration service |
nds_staged_messages_count | Total number of unprocessed NDS notification messages that are staged on the storage |
nds_endpoint_process_count | Number of NDS notification messages that are being simultaneously processed on the endpoint |
instance_vol_svc:ostor_nds_total:rate5m | Number of NDS notification messages per second by a particular NDS service for 5 minutes |
instance_vol_svc:ostor_nds_repeat_total:rate5m | Number of repeated NDS notification messages per second by a particular NDS service for 5 minutes |
instance_vol_svc:ostor_nds_error_total:rate5m | Number of all NDS notification processing errors per second by a particular NDS service for 5 minutes |
instance_vol_svc:ostor_nds_delete_error_total:rate5m | Number of all NDS notification deletion errors per second by a particular NDS service for 5 minutes |
rpc_errors_total | Number of RPC errors reported by the user space part of storage |
fused_kernel_rpc_errors_total | Number of RPC errors reported by the kernel part of storage |
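The metrics listed in the table can be read through the standard Prometheus HTTP API. The following Python sketch shows one possible way to build instant-query URLs for two derived values: the share of S3 gateway requests that end in a server error, and a GET latency quantile. It rests on several assumptions that are not stated in this guide: the Prometheus address `http://localhost:9090`, the helper names, the idea that dividing the two request rates gives a meaningful error ratio, and the presence of the histogram `le` label on the recorded latency series. Verify each of these against your cluster before relying on the results.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable at this address; adjust for your cluster.
PROMETHEUS_URL = "http://localhost:9090"


def instant_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def server_error_ratio_expr() -> str:
    """PromQL sketch: server-error rate divided by the overall request rate."""
    return (
        "sum(instance_vol_svc:ostor_req_server_err:rate5m)"
        " / sum(instance_vol_svc:ostor_s3gw_req:rate5m)"
    )


def latency_quantile_expr(quantile: float) -> str:
    """PromQL sketch: estimate a GET latency quantile from the histogram series."""
    # Assumption: the recorded series keeps the histogram `le` label.
    return (
        f"histogram_quantile({quantile}, sum by (le) "
        "(instance_vol_svc:ostor_s3gw_get_req_latency_ms_bucket:rate5m))"
    )


if __name__ == "__main__":
    print(instant_query_url(server_error_ratio_expr()))
    print(instant_query_url(latency_quantile_expr(0.95)))
```

The printed URLs can be opened with any HTTP client. The sketch only constructs the requests and does not contact Prometheus itself.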
Bucket and user size metrics
Metrics that report object storage usage per bucket and per user are not available by default. To collect these statistics, enable them by running the following command on an S3 node:
# ostor-ctl set-vol -V 0100000000000002 --enable-stat
The following metrics will appear in Prometheus:
account_control_buckets_size
: Bucket size, in bytes
account_control_user_size
: Total size of all user buckets, in bytes
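Because both metrics report raw byte counts, it can help to convert them to readable units when inspecting query results. The sketch below parses a Prometheus instant-query response and formats the sizes. The sample payload is entirely hypothetical: the label name `bucket`, the bucket names, and the values are assumptions made for illustration, since this guide does not list the labels these metrics carry. Only the overall response layout follows the standard Prometheus HTTP API format.

```python
import json


def format_bytes(num_bytes: float) -> str:
    """Render a byte count using binary units (KiB, MiB, ...)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024


def sizes_from_response(payload: str, label: str) -> dict:
    """Map the value of `label` to a size in bytes for each returned series."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    sizes = {}
    for series in body["data"]["result"]:
        name = series["metric"].get(label, "<unlabelled>")
        # An instant-vector sample is a pair: [timestamp, "value-as-string"].
        sizes[name] = float(series["value"][1])
    return sizes


# Hypothetical response for account_control_buckets_size.
# Assumption: the series carries a "bucket" label; check the real label set.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"bucket": "photos"}, "value": [1700000000, "1073741824"]},
            {"metric": {"bucket": "logs"}, "value": [1700000000, "1536"]},
        ],
    },
})

if __name__ == "__main__":
    for name, size in sizes_from_response(SAMPLE, "bucket").items():
        print(f"{name}: {format_bytes(size)}")
```

In practice, the payload would come from a query for account_control_buckets_size or account_control_user_size, and the label passed to `sizes_from_response` would be whichever label identifies the bucket or user in your deployment.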