1.3. Planning Infrastructure for Virtuozzo Storage with CLI Management

To plan your infrastructure for Virtuozzo Storage managed by command-line tools, you will need to decide on the hardware configuration of each storage node and plan the storage networks.

Information in this section is meant to help you complete these tasks.

1.3.1. Understanding Virtuozzo Storage Architecture

Before starting the deployment process, you should have a clear idea of the Virtuozzo Storage infrastructure. A typical Virtuozzo Storage cluster is shown below.

../_images/image003.png

The basic component of Virtuozzo Storage is a cluster. The cluster is a group of physical computers connected to the same Ethernet network and performing the following roles:

  • chunk servers (CS),
  • metadata servers (MDS),
  • client computers (or clients).

All data in a Virtuozzo Storage cluster, including virtual machine and container disk images, is stored in the form of fixed-size chunks on chunk servers. The cluster automatically replicates the chunks and distributes them across the available chunk servers to provide high availability of data.

To keep track of data chunks and their replicas, the cluster stores metadata about them on metadata servers (MDS). The central MDS server, called the master MDS server, monitors all cluster activity and keeps metadata current.

Clients manipulate data stored in the cluster by sending different types of file requests, such as modifying an existing file or creating a new one.

  • Chunk servers (CS). Chunk servers store all the data, including the contents of virtual machines and containers, in the form of fixed-size chunks and provide access to these chunks. All data chunks are replicated and the replicas are kept on different chunk servers to achieve high availability. If one of the chunk servers goes down, the other chunk servers will continue providing the data chunks that were stored on the failed server.

  • Metadata servers (MDS). Metadata servers store metadata about chunk servers and control how files keeping the contents of virtual machines and containers are split into chunks and where these chunks are located. MDS servers also ensure that a cluster has enough chunk replicas and store a global log of all important events that happen in the cluster.

    To provide high availability for a Virtuozzo Storage cluster, you need to set up several MDS servers in the cluster. In this case, if one MDS server goes offline, another MDS server will continue keeping control over the cluster.

    Note

    MDS servers deal with processing metadata only and do not normally participate in any read/write operations related to data chunks.

  • Clients. Clients are computers running Virtuozzo 7 from which you run virtual machines and containers stored in a Virtuozzo Storage cluster.

Note

  1. You can set up any computer in the cluster to perform the role of a metadata server, chunk server, or client. You can also assign two or all three roles to one and the same computer. For example, you can configure a computer to act as a client by installing Virtuozzo 7 on it and running virtual machines and containers from the computer. At the same time, if you want this computer to allocate its local disk space to the cluster, you can set it up as a chunk server.
  2. Though Virtuozzo Storage can be mounted as a file system, it is not a POSIX-compliant file system and lacks some POSIX features, such as ACLs, user and group credentials, hard links, and some others.

1.3.2. Virtuozzo Storage Configurations

This section provides information on two Virtuozzo Storage configurations:

  • Minimum Configuration. You can create the minimum configuration for evaluating the Virtuozzo Storage functionality. This configuration, however, is not recommended for use in production environments.
  • Recommended Configuration. You can use the recommended Virtuozzo Storage configuration in a production environment “as is” or adapt it to your needs.

1.3.2.1. Minimum Configuration

The minimum hardware configuration for deploying a Virtuozzo Storage cluster is given below:

Server Role                Number of Servers
Metadata Server            1 (can be shared with chunk servers and clients)
Chunk Server               1 (can be shared with metadata servers and clients)
Client                     1 (can be shared with chunk and metadata servers)
Total number of servers    with role sharing: 1; without role sharing: 3

Graphically, the minimum configuration can be represented as follows:

../_images/image001.png

For a Virtuozzo Storage cluster to function, it must have at least one MDS server, one chunk server, and one client. The minimum configuration has two main limitations:

  1. The cluster has one metadata server, which is a single point of failure. If the metadata server fails, the entire Virtuozzo Storage cluster will become non-operational.
  2. The cluster has one chunk server that can store only one chunk replica. If the chunk server fails, the cluster will suspend all operations with chunks until a new chunk server is added to the cluster.
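The two limitations above can be turned into a quick sanity check on a planned layout. The following is a minimal Python sketch; the function name `check_layout` and its warning texts are hypothetical and are not part of any Virtuozzo tool. The counts refer to roles, which may share physical servers.

```python
def check_layout(mds_servers: int, chunk_servers: int, clients: int) -> list:
    """Return planning warnings for a cluster layout (illustrative only)."""
    # A cluster cannot function without at least one of each role.
    if min(mds_servers, chunk_servers, clients) < 1:
        raise ValueError(
            "a cluster needs at least one MDS server, "
            "one chunk server, and one client"
        )
    warnings = []
    if mds_servers == 1:
        warnings.append("single MDS server: single point of failure")
    if chunk_servers == 1:
        warnings.append("single chunk server: only one chunk replica can be stored")
    return warnings


# The minimum configuration triggers both warnings.
for message in check_layout(mds_servers=1, chunk_servers=1, clients=1):
    print(message)
```

A layout with several MDS servers and several chunk servers returns an empty list, which only means that neither of the two limitations listed above applies.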

1.3.3. Hardware Requirements

Before setting up a Virtuozzo Storage cluster, make sure you have all the necessary equipment at hand.

You are also recommended to:

General

  • Each service (be it MDS, CS, or client) requires 1.5 GB of free space on the root partition for logs. For example, to run 1 metadata server, 1 client, and 12 chunk servers on a host, you will need 21 GB of free space on the root partition.
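The arithmetic behind the example above is (1 + 1 + 12) services × 1.5 GB = 21 GB. A minimal Python sketch of the same calculation follows; the function name `root_log_space_gb` is hypothetical.

```python
LOG_SPACE_PER_SERVICE_GB = 1.5  # per MDS, CS, or client service, as stated above


def root_log_space_gb(mds: int, cs: int, clients: int) -> float:
    """Return the free root-partition space (in GB) needed for service logs."""
    return (mds + cs + clients) * LOG_SPACE_PER_SERVICE_GB


# The example from the text: 1 metadata server, 12 chunk servers, 1 client.
print(root_log_space_gb(mds=1, cs=12, clients=1))  # 21.0
```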

Metadata Servers

  • 1 CPU core,
  • 1 GB of RAM per 100 TB of data in the cluster,
  • 3 GB of disk space per 100 TB of data in the cluster,
  • 1 or more Ethernet adapters (1 Gbit/s or faster).
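As a rough planning aid, the per-100-TB figures above can be scaled to the expected amount of data in the cluster. The sketch below assumes linear scaling; the source does not say whether values should be rounded up to the next 100 TB, and the function name `mds_requirements` is hypothetical.

```python
def mds_requirements(cluster_data_tb: float) -> dict:
    """Estimate the resources for one metadata server.

    Assumption: RAM and disk scale linearly with the data in the cluster
    (1 GB of RAM and 3 GB of disk space per 100 TB).
    """
    blocks = cluster_data_tb / 100  # number of 100 TB blocks
    return {"cpu_cores": 1, "ram_gb": 1 * blocks, "disk_gb": 3 * blocks}


print(mds_requirements(250))
# {'cpu_cores': 1, 'ram_gb': 2.5, 'disk_gb': 7.5}
```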

Note

It is recommended to place the MDS journal on an SSD, either dedicated or shared with CS and client caches, or at least on a dedicated HDD that has no CS services on it.

Chunk Servers

  • 1/8 of a CPU core (e.g., 1 CPU core per 8 CS),
  • 1 GB of RAM,
  • 100 GB or more of free disk space,
  • 1 or more Ethernet adapters (1 Gbit/s or faster).
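For a host that runs several chunk servers, the figures above can be added up. The following Python sketch assumes that the RAM and disk figures apply to each chunk server separately and that fractional CPU cores are rounded up; both are assumptions rather than statements from this guide, and the function name `cs_host_requirements` is hypothetical.

```python
import math


def cs_host_requirements(cs_count: int) -> dict:
    """Estimate host resources for a given number of chunk servers."""
    return {
        # 1/8 of a core per CS, rounded up to whole cores (assumption).
        "cpu_cores": math.ceil(cs_count / 8),
        # 1 GB of RAM per CS (assumed to be a per-CS figure).
        "ram_gb": cs_count * 1,
        # At least 100 GB of free disk space per CS (assumed per-CS).
        "min_free_disk_gb": cs_count * 100,
    }


print(cs_host_requirements(12))
# {'cpu_cores': 2, 'ram_gb': 12, 'min_free_disk_gb': 1200}
```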

Note

  1. For details on using local RAID with Virtuozzo Storage, consult the Virtuozzo Storage Administrator’s Command Line Guide.
  2. Using a shared JBOD array across multiple nodes running CS services may introduce a single point of failure and make the cluster unavailable if all data replicas happen to be allocated and stored on the failed JBOD. For more information, see here.
  3. For large clusters, it is critically important to configure proper failure domains to improve data availability. For more information, see here.
  4. Do not place chunk servers on disks already used by other I/O workloads, e.g., system or swap. Sharing disks between CS and other sources of I/O will result in severe performance loss and high I/O latencies.

Clients

  • 1 CPU core per 30,000 IOPS,
  • 1 GB of RAM,
  • 1 or more Ethernet adapters (1 Gbit/s or faster).
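The CPU figure for clients can be applied to an expected peak load as follows. This is a minimal sketch that assumes partial cores are rounded up; the function name `client_cpu_cores` is hypothetical.

```python
import math

IOPS_PER_CORE = 30_000  # 1 CPU core per 30,000 IOPS, as listed above


def client_cpu_cores(expected_iops: int) -> int:
    """Return the number of CPU cores for a client, rounded up (assumption)."""
    return max(1, math.ceil(expected_iops / IOPS_PER_CORE))


print(client_cpu_cores(75_000))  # 3
```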

The following table lists the maximum network performance a Virtuozzo Storage client can get with the specified network interface. For clients, it is recommended to use 10Gbps network hardware between any two cluster nodes and to minimize network latencies, especially if SSDs are used.

Storage network interface                           1Gbps     2 x 1Gbps   3 x 1Gbps   10Gbps    2 x 10Gbps
Entire node maximum I/O throughput                  100MB/s   ~175MB/s    ~250MB/s    1GB/s     1.75GB/s
Single VM maximum I/O throughput (replication)      100MB/s   100MB/s     100MB/s     1GB/s     1GB/s
Single VM maximum I/O throughput (erasure coding)   70MB/s    ~130MB/s    ~180MB/s    700MB/s   1.3GB/s
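One way to use the table when planning is to look up which interfaces can reach a target throughput. The Python sketch below copies the table values into a dictionary, treating 1 GB/s as 1000 MB/s and dropping the "~" from approximate values (both simplifications are assumptions). The names `THROUGHPUT_MBPS` and `interfaces_meeting` are hypothetical.

```python
# Maximum throughput in MB/s per storage network interface (from the table above).
THROUGHPUT_MBPS = {
    "1Gbps":      {"node": 100,  "vm_replication": 100,  "vm_erasure_coding": 70},
    "2 x 1Gbps":  {"node": 175,  "vm_replication": 100,  "vm_erasure_coding": 130},
    "3 x 1Gbps":  {"node": 250,  "vm_replication": 100,  "vm_erasure_coding": 180},
    "10Gbps":     {"node": 1000, "vm_replication": 1000, "vm_erasure_coding": 700},
    "2 x 10Gbps": {"node": 1750, "vm_replication": 1000, "vm_erasure_coding": 1300},
}


def interfaces_meeting(target_mbps: float, workload: str) -> list:
    """List the interfaces whose maximum throughput reaches the target."""
    return [
        name
        for name, row in THROUGHPUT_MBPS.items()
        if row[workload] >= target_mbps
    ]


print(interfaces_meeting(150, "vm_erasure_coding"))
# ['3 x 1Gbps', '10Gbps', '2 x 10Gbps']
```

Note that for a single VM with replication, bonding 1Gbps links does not raise the maximum above 100 MB/s, so only the 10Gbps options satisfy higher targets.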

Note

For hard disk requirements and the recommended partitioning scheme for servers that run Virtuozzo and participate in clusters, see Partitioning the Hard Drives.

1.3.4. Network Requirements

When planning your network, make sure that it

  • operates at 1 Gbit/s or faster (for more details, see here),
  • has non-blocking Ethernet switches.

You should use separate networks and Ethernet adapters for user and cluster traffic. This prevents external traffic from degrading I/O performance in your cluster. Besides, if a cluster is accessible from public networks (e.g., from the Internet), it may become a target of Denial-of-Service attacks, and the entire cluster I/O subsystem may stall.

The figure below shows a sample network configuration for Virtuozzo Storage.

../_images/image002.png

In this network configuration:

  • BackNet is a private network used solely for interconnection and intercommunication of servers in the cluster and is not available from the public network. All servers in the cluster are connected to this network via one of their network cards.
  • FrontNet is a public network customers use to access their virtual machines and containers in the Virtuozzo Storage cluster.

Note

  1. Network switches are a very common point of failure, so it is critically important to configure proper failure domains to improve data availability. For more information, see here.
  2. To learn more about Virtuozzo Storage networks (in particular, how to bind chunk servers to specific IP addresses), see here.