10.25. Monitoring Nodes and Virtual Environments via Zabbix

Note

The collected statistics are only intended for monitoring and are not suitable for billing purposes.

You can use Zabbix to monitor nodes running Virtuozzo Hybrid Server 7.5 or newer, as well as the virtual environments (VEs) running on them.

This guide describes how to install the Zabbix agent on a Virtuozzo Hybrid Server node and configure an existing Zabbix server. For details on installing Zabbix, see its documentation.

10.25.1. Installing the Zabbix Agent

Perform these steps on each Virtuozzo Hybrid Server 7.5 node that you want to monitor.

  1. Install the Zabbix repository. For example:

    # rpm -Uvh https://repo.zabbix.com/zabbix/5.0/rhel/7/x86_64/\
    zabbix-release-5.0-1.el7.noarch.rpm
    # yum clean all
    
  2. Install the agent package:

    # yum install zabbix-agent
    
  3. Configure the Zabbix agent by editing /etc/zabbix/zabbix_agentd.conf as follows:

    Server=<zabbix_server_IP>
    ListenIP=<listen_IP>
    ServerActive=<zabbix_server_IP>
    Hostname=<hostname>
    Timeout=30
    AllowRoot=1
    

    Where:

    • Server and ServerActive are both set to the IP address of the Zabbix server.

    • ListenIP is the node’s IP address on which the agent listens for incoming connections from the Zabbix server.

    • Hostname is a unique name that must match the name you will use to register the host in the Zabbix web panel. It is recommended to use the node’s actual hostname.

    • Timeout is increased to allow a heavily loaded node enough time to send the statistics data to the Zabbix server.

    • AllowRoot=1 is required to let the user zabbix collect statistics from the node.

  4. Add the user zabbix to sudoers. For example:

    # visudo
    zabbix  ALL=(ALL)   NOPASSWD: ALL
    
  5. Configure the node’s firewall:

    # firewall-cmd --zone=public --permanent --add-port=10050-10051/tcp
    # firewall-cmd --reload
    
  6. Install the Virtuozzo module and template:

    # yum install vz-zabbix-agent
    

    The package will supply the following Zabbix XML template:

    # rpm -ql vz-zabbix-agent | grep xml
    /etc/zabbix/templates/zbx_virtuozzo_template.xml
    

    You will need to import it to the Zabbix web panel later.

  7. Enable and start the Zabbix agent:

    # systemctl enable zabbix-agent
    # systemctl start zabbix-agent
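
As a quick sanity check before registering the node in the web panel, the settings from step 3 can be verified with a small shell function. This is a minimal sketch and not part of Virtuozzo or Zabbix: the function name check_agent_conf is hypothetical, and it assumes that each setting appears as a plain key=value line in the file.

```shell
#!/bin/sh
# check_agent_conf FILE (hypothetical helper)
# Reports, for each setting from step 3, whether it is set, missing,
# or still contains a <placeholder>. Returns 0 only if all are set.
check_agent_conf() {
    conf=$1
    status=0
    for key in Server ListenIP ServerActive Hostname Timeout AllowRoot; do
        # Pick the value of the last uncommented "key=value" line, if any.
        value=$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$conf" | tail -n 1)
        case $value in
            '')    echo "missing: $key"; status=1 ;;
            *'<'*) echo "placeholder: $key=$value"; status=1 ;;
            *)     echo "ok: $key=$value" ;;
        esac
    done
    return "$status"
}
```

For example, run check_agent_conf /etc/zabbix/zabbix_agentd.conf on the node. If the zabbix-get package is installed on the Zabbix server, running zabbix_get -s <node_IP> -k agent.ping there can additionally confirm that the agent is reachable.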
    

10.25.2. Configuring Zabbix Server

To see the data collected from the nodes, do the following in the Zabbix web panel:

  1. Navigate to Configuration -> Templates. Click Import and import zbx_virtuozzo_template.xml obtained in the previous section.

  2. Navigate to Configuration -> Hosts and click Create Host. In Host name, enter the same name as specified in Hostname in zabbix_agentd.conf. Next, select a Group and provide the agent’s network interface details in Interfaces. Specify other details if needed and click Add.

  3. Navigate to Configuration -> Hosts -> <host> -> Templates. In Link new templates, find and select Template Virtuozzo, then click Update.

    If you also want to use S.M.A.R.T. triggers provided by Virtuozzo, instead of or together with the standard ones, link Template Virtuozzo SMART as well.

The statistics for the host will become available in the Zabbix web panel in a short while.

Optionally, you can filter out unnecessary metrics by excluding ploop mounts from mounted file system discovery, which the Zabbix Linux server template performs every hour by default. You can also safely filter out the vme network interfaces of virtual machines. Do the following:

  1. Navigate to Configuration -> Templates -> Template OS Linux by Zabbix agent -> Macros -> Inherited and template macros.

  2. Locate the macro {$VFS.FS.FSNAME.NOT_MATCHES}. Add /vz/root/| at the start of its value, right after the opening parenthesis. For example:

    ^(/vz/root/|/dev|/sys|/run|/proc|.+/shm$)
    

    Newly discovered ploops will be filtered out from now on. The statistics for the already discovered ploops will be dropped when the grace period is over. The grace period is 30 days by default. You can change it in the file system discovery rule.

  3. Locate the macro {$NET.IF.IFNAME.NOT_MATCHES} and add ^vme[0-9a-z]+| at the start of its value, right after the opening parenthesis. For example (a single line, wrapped here for readability):

    (^vme[0-9a-z]+|^Software Loopback Interface|^NULL[0-9.]*$|^[Ll]o[0-9.]*$|
    ^[Ss]ystem$|^Nu[0-9.]*$|^veth[0-9a-z]+$|docker[0-9]+|br-[a-z0-9]{12})
    

    Now newly discovered vme interfaces will be filtered out as well.
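
Before saving the macros, you may want to check which names the new patterns exclude. The sketch below uses grep -E (POSIX extended regular expressions) as an approximation of the Perl-compatible regular expressions that Zabbix uses, so treat the result as a rough check only. The helper names excluded and report, as well as the sample mount points and interface names, are hypothetical.

```shell
#!/bin/sh
# Value of {$VFS.FS.FSNAME.NOT_MATCHES} after step 2.
FS_NOT_MATCHES='^(/vz/root/|/dev|/sys|/run|/proc|.+/shm$)'
# The fragment added to {$NET.IF.IFNAME.NOT_MATCHES} in step 3.
VME_PART='^vme[0-9a-z]+'

# excluded NAME REGEX (hypothetical helper)
# Succeeds if NAME matches REGEX, that is, if discovery would skip it.
excluded() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

# report NAME REGEX (hypothetical helper)
# Prints whether NAME would be skipped or kept by discovery.
report() {
    if excluded "$1" "$2"; then echo "skip $1"; else echo "keep $1"; fi
}

# Sample names (assumptions for illustration only).
report /vz/root/101    "$FS_NOT_MATCHES"   # prints "skip /vz/root/101"
report /vz             "$FS_NOT_MATCHES"   # prints "keep /vz"
report vme001c42f1a2b3 "$VME_PART"         # prints "skip vme001c42f1a2b3"
report eth0            "$VME_PART"         # prints "keep eth0"
```

Note that only paths under /vz/root/ are excluded, so a file system mounted at /vz itself would still be discovered.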

10.25.3. Supported Triggers

The following triggers (i.e. alerts) are supported for Virtuozzo Hybrid Server.

| Trigger | Priority | Related metric | What to do |
|---|---|---|---|
| License is not active. | Disaster | virtuozzo.host.license_status | Check and update the license. |
| A S.M.A.R.T. metric is critically low. The disk is about to fail. S.M.A.R.T. metrics watched: Command_Timeout, Current_Pending_Sector, End-To-End_Error, Offline_Uncorrectable, Reallocated_Event_Count, Reallocated_Sector_Ct, Spin_Retry_Count. | Disaster | virtuozzo.smart.discovery | The node’s disk is about to fail. Replace it as soon as possible. |
| A S.M.A.R.T. metric is greater than zero. The disk health deteriorates. S.M.A.R.T. metrics watched: Current_Pending_Sector, Offline_Uncorrectable, Reallocated_Event_Count, Reallocated_Sector_Ct. | Warning | virtuozzo.smart.discovery | Inspect the health of node’s disk. You may need to replace it soon. |
| Virtual machine’s or container’s memory usage has been over 95% for the last 1 hour. | Warning | virtuozzo.ct.memory_used, virtuozzo.vm.memory_used | Check the virtual environment for problems, including software issues or malware. |
| Guest OS crashed in a virtual machine. | High | virtuozzo.vm.status | Find out the reasons for the crash, fix the virtual machine, and restart any services that will not do so automatically. |
| Virtual machine’s balloon driver is not responding. The VM has stopped reporting its memory usage statistics and is not releasing node’s memory automatically. | Info | virtuozzo.vm.memory_upd_timestamp | Find out what has happened to the virtio_balloon kernel module inside the VM. Reload the module. |

10.25.4. Managing Disk I/O Parameters

I/O and IOPS limiting works only with supported I/O schedulers. The schedulers available for a device are listed in /sys/block/<device>/queue/scheduler. For example:

# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

The I/O scheduler in use is enclosed in square brackets. In this example, it is cfq.

The following I/O schedulers are supported:

  • CFQ. A separate block device line with counters is added to the iostat file. For example:

    # cat /proc/bc/100/iostat
    flush 100 . 0 0 0 0 0 7389 1893968 0 0
    fuse 100 . 0 0 0 0 0 0 0 0 0
    sda 100 . 0 0 0 9000 1843380 245216 55845488 245028 188
    
  • Deadline. The counters are added to the values in the flush line.

I/O and IOPS limiting does not work for devices that use the noop I/O scheduler or have no scheduler at all (for example, logical devices and Ceph RBD devices).
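
The scheduler check described in this section can be scripted. The following is a minimal sketch under the assumption that the scheduler file has the bracketed format shown earlier; the function names active_scheduler and limits_supported are hypothetical and are not provided by Virtuozzo.

```shell
#!/bin/sh
# active_scheduler LINE (hypothetical helper)
# Prints the scheduler name enclosed in square brackets,
# or nothing if the line contains no bracketed name.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# limits_supported LINE (hypothetical helper)
# Succeeds only if the active scheduler is cfq or deadline,
# the two schedulers this section lists as supported.
limits_supported() {
    case $(active_scheduler "$1") in
        cfq|deadline) return 0 ;;
        *)            return 1 ;;
    esac
}
```

For example, on a node you could run limits_supported "$(cat /sys/block/sda/queue/scheduler)" && echo "sda: limits apply", where sda is only a sample device name.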