3.6. Managing Containers Memory Parameters

This section describes the VSwap memory management system. You will learn to do the following:

  • Configure the main VSwap parameters for containers.
  • Set the memory allocation limit in containers.
  • Configure OOM killer behavior.
  • Enhance the VSwap functionality.

3.6.1. Configuring Main VSwap Parameters

Virtuozzo utilizes the VSwap scheme for managing memory-related parameters in containers. Like many other memory management schemes used on standalone Linux computers, this scheme is based on two main parameters:

  • RAM determines the total size of RAM that can be used by the processes of a container.
  • swap determines the total size of swap that can be used by a container for swapping out memory once the RAM is exceeded.

The memory management scheme works as follows:

  1. You set for a container a certain amount of RAM and swap space that can be used by the processes running in the container.
  2. When the container exceeds the RAM limit set for it, the swapping process starts. The swapping process for containers slightly differs from that on a standalone computer. The container swap file is virtual and, if possible, resides in the Node RAM. In other words, when the swap-out for a container starts and the Node has enough RAM to keep the swap file, the swap file is stored in the Node RAM rather than on the hard drive.
  3. Once the container exceeds its swap limit, the system invokes the OOM Killer for this container.
  4. The OOM Killer chooses one or more processes running in the affected container and forcibly kills them.

By default, any newly created container starts using the new memory management scheme. To find out the amount of RAM and swap space set for a container, you can check the values of the PHYSPAGES and SWAPPAGES parameters in the container configuration file, for example:

# grep PHYSPAGES /etc/vz/conf/26bc47f6-353f-444b-bc35-b634a88dbbcc.conf
PHYSPAGES="65536:65536"
# grep SWAPPAGES /etc/vz/conf/26bc47f6-353f-444b-bc35-b634a88dbbcc.conf
SWAPPAGES="65536"

In this example, the value of the PHYSPAGES parameter for the container MyCT is set to 65536. The PHYSPAGES parameter displays the amount of RAM in 4-KB pages, so the total amount of RAM set for the container MyCT equals to 256 MB. The value of the SWAPPAGES parameter is also set to 256 MB.

To configure the amounts of RAM and swap space for the container MyCT, use the --memsize and --swappages options of the prlctl set command. For example, you can execute the following command to set the amount of RAM and SWAP in the container MyCT to 1 GB and 512 MB, respectively:

# prlctl set MyCT --memsize 1G --swappages 512M

3.6.2. Configuring Container Memory Guarantees

A memory guarantee is a percentage of container’s RAM that said container is guaranteed to have.

Important

The total memory guaranteed to all running virtual environments on the host must not exceed host’s physical RAM size. If starting a virtual environment with a memory guarantee would increase the total memory guarantee on the host beyond host’s physical RAM size, said virtual environment will not start. If setting a memory guarantee for a running virtual environment would increase the total memory guarantee on the host beyond host’s physical RAM size, said memory guarantee will not be set.

For containers, the memory guarantee value is set to 0% by default. To change the default value, use the prlctl set --memguarantee command. For example:

# prlctl set MyCT --memguarantee 80

To revert to the default setting, run

# prlctl set MyCT --memguarantee auto

3.6.3. Configuring Container Memory Allocation Limit

When an application starts in a container, it allocates a certain amount of memory for its needs. Usually, the allocated memory is much more than the application actually requires for its execution. This may lead to a situation when you cannot run an application in the container even if it has enough free memory. To deal with such situations, the VSwap memory management scheme introduces the new vm_overcommit option. Using it, you can configure the amount of memory applications in a container may allocate, irrespective of the amount of RAM and swap space assigned to the container.

The amount of memory that can be allocated by applications of a container is the sum of RAM and swap space set for this container multiplied by a memory overcommit factor. In the default (basic) container configuration file, this factor is set to 1.5. For example, if a container is based on the default configuration file and assigned 1 GB of RAM and 512 MB of swap, the memory allocation limit for the container will be 2304 MB. You can configure this limit and set it, for example, to 3 GB by running this command:

# vzctl set MyCT --vm_overcommit 2 --save

This command uses the factor of 2 to increase the memory allocation limit to 3 GB:

(1 GB of RAM + 512 MB of swap) * 2 = 3 GB

Now applications in the container MyCT can allocate up to 3 GB of memory, if necessary.

3.6.4. Configuring Container OOM Killer Behavior

The OOM killer selects a container process (or processes) to end based on the badness reflected in /proc/<pid>/oom_score. The badness is calculated using process memory, total memory, and badness adjustment, and then clipped to the range from 0 to 1000. Each badness point stands for one thousandth of container memory. The process to be killed is the one with the highest resulting badness.

The OOM killer for container processes can be configured using the /etc/vz/oom-groups.conf file that lists patterns based on which badness adjustment is selected for each running process. Each pattern takes a single line and includes the following columns:

  • <command>, mask for the task command name;
  • <parent>, mask for the parent task name;
  • <oom_uid>, task user identifier (UID) filter:
    • If <oom_uid> is -1, the pattern will be applicable to tasks with any UIDs,
    • If <oom_uid> is 0 or higher, the pattern will be applicable to tasks with UIDs equal to the <oom_uid> value,
    • If <oom_uid> is smaller than -1, the pattern will be applicable to tasks with UIDs smaller than the negative <oom_uid> value);
  • <oom_score_adj> badness adjustment. As with badness itself, each adjustment point stands for one thousandth of total container memory. Negative adjustment values reduce process badness. In an out-of-memory situation, an adjustment will guarantee that the process will be allowed to occupy at least <oom_score_adj> thousandths of container memory while there are other processes with higher badness running in the container.

Note

The <command> and <parent> masks support wildcard suffixes: asterisk matches any suffix. E.g., “foo” matches only “foo”, “foo*” matches “foo” and “foobar”.

For example, the pattern

sshd     init     -500     -100

means that in an out-of-memory situation, sshd, a child of init, will be guaranteed at least 100 thousandths (i.e., 10%) of container memory, if its UID is smaller than -(-500) or just 500, e.g., 499. According to RHEL conventions, UIDs from 1 to 499 are usually reserved for system use, so such delimitation may be useful to prioritize and save system processes.

While calculating the badness of a process, the OOM killer searches /proc/vz/oom_score_adj for a suitable pattern based on masks and task UID filter. The search starts from the first line and ends when the first suitable pattern is found. The corresponding adjustment value is then used to obtain the resulting process badness.

The data from /etc/vz/oom-groups.conf is reset and committed to the kernel on boot. To reset and commit the config file manually, you can use the following command:

# cat /etc/vz/oom-groups.conf > /proc/vz/oom_score_adj

3.6.5. Tuning VSwap

The VSwap management scheme can be extended by using UBC parameters. For example, you can set the numproc parameter to configure the maximal number of processes and threads a container may create or the numfile parameter to specify the number of files that may be opened by all processes in the container.