1.3. Planning Infrastructure for Virtuozzo Storage with CLI Management
To plan your infrastructure for Virtuozzo Storage managed by command-line tools, you will need to decide on the hardware configuration of each storage node and plan the storage networks.
Information in this section is meant to help you complete these tasks.
1.3.1. Understanding Virtuozzo Storage Architecture
Before starting the deployment process, you should have a clear idea of the Virtuozzo Storage infrastructure. A typical Virtuozzo Storage cluster is shown below.
The basic component of Virtuozzo Storage is a cluster. The cluster is a group of physical computers connected to the same Ethernet network and performing the following roles:
Chunk servers (CS)
Metadata servers (MDS)
Client computers (or clients)
All data in a Virtuozzo Storage cluster, including virtual machine and container disk images, is stored in the form of fixed-size chunks on chunk servers. The cluster automatically replicates the chunks and distributes them across the available chunk servers to provide high availability of data.
To keep track of data chunks and their replicas, the cluster stores metadata about them on metadata servers (MDS). The central MDS server, called the master MDS server, monitors all cluster activity and keeps metadata current.
Clients manipulate data stored in the cluster by sending different types of file requests, such as modifying an existing file or creating a new one.
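The chunking and replication described above can be illustrated with a toy model. This is a minimal sketch and not the placement algorithm Virtuozzo Storage actually uses: the chunk size, the round-robin policy, the server names, and the functions `split_into_chunks` and `place_replicas` are all assumptions made for illustration.

```python
import math


def split_into_chunks(file_size: int, chunk_size: int) -> int:
    """Return how many fixed-size chunks are needed to hold file_size bytes."""
    return math.ceil(file_size / chunk_size)


def place_replicas(num_chunks: int, servers: list, replicas: int) -> dict:
    """Assign each chunk to `replicas` distinct chunk servers.

    Hypothetical round-robin policy, used only to show that replicas
    of one chunk never share a chunk server.
    """
    if replicas > len(servers):
        raise ValueError("not enough chunk servers to hold distinct replicas")
    placement = {}
    for chunk in range(num_chunks):
        placement[chunk] = [
            servers[(chunk + offset) % len(servers)] for offset in range(replicas)
        ]
    return placement


# Made-up numbers: a 1000-byte file, 256-byte chunks, 5 chunk servers, 3 replicas.
chunks = split_into_chunks(1000, 256)
layout = place_replicas(chunks, ["cs1", "cs2", "cs3", "cs4", "cs5"], 3)
```

In this model, losing any single chunk server still leaves two replicas of every chunk on the remaining servers.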
Chunk servers (CS). Chunk servers store all the data, including the contents of virtual machines and containers, in the form of fixed-size chunks and provide access to these chunks. All data chunks are replicated and the replicas are kept on different chunk servers to achieve high availability. If one of the chunk servers goes down, the other chunk servers will continue providing the data chunks that were stored on the failed server.
Metadata servers (MDS). Metadata servers store metadata about chunk servers and control how files keeping the contents of virtual machines and containers are split into chunks and where these chunks are located. MDS servers also ensure that a cluster has enough chunk replicas and store a global log of all important events that happen in the cluster.
To provide high availability for a Virtuozzo Storage cluster, you need to set up several MDS servers in the cluster. In this case, if one MDS server goes offline, another MDS server will continue to control the cluster.
Note
MDS servers deal with processing metadata only and do not normally participate in any read/write operations related to data chunks.
Clients. Clients are computers with Virtuozzo Hybrid Server 7 from where you run virtual machines and containers stored in a Virtuozzo Storage cluster.
Note the following:
You can set up any computer in the cluster to perform the role of a metadata server, chunk server, or client. You can also assign two or all three roles to the same computer. For example, you can configure a computer to act as a client by installing Virtuozzo Hybrid Server 7 on it and running virtual machines and containers from it. At the same time, if you want this computer to allocate its local disk space to the cluster, you can set it up as a chunk server.
Though Virtuozzo Storage can be mounted as a file system, it is not a POSIX-compliant file system and lacks some POSIX features such as ACLs, user and group credentials, hard links, and others.
1.3.2. Virtuozzo Storage Configurations
This section provides information on two Virtuozzo Storage configurations:
Minimum Configuration. You can create the minimum configuration for evaluating the Virtuozzo Storage functionality. This configuration, however, is not recommended for use in production environments.
Recommended Configuration. You can use the recommended Virtuozzo Storage configuration in a production environment “as is” or adapt it to your needs.
1.3.2.1. Minimum Configuration
The minimum hardware configuration for deploying a Virtuozzo Storage cluster is given below:
Server Role | Number of Servers |
---|---|
Metadata Server | 1 (can be shared with chunk servers and clients) |
Chunk Server | 1 (can be shared with metadata servers and clients) |
Client | 1 (can be shared with chunk and metadata servers) |
Total number of servers | With role sharing: 1; without role sharing: 3 |
Graphically, the minimum configuration can be represented as follows:
For a Virtuozzo Storage cluster to function, it must have at least one MDS server, one chunk server, and one client. The minimum configuration has two main limitations:
The cluster has one metadata server, which presents a single point of failure. If the metadata server fails, the entire Virtuozzo Storage cluster will become non-operational.
The cluster has one chunk server that can store only one chunk replica. If the chunk server fails, the cluster will suspend all operations with chunks until a new chunk server is added to the cluster.
1.3.2.2. Recommended Configuration
The table below lists two of the recommended configurations for deploying Virtuozzo Storage clusters.
Metadata Server | Chunk Server | Total Number of Servers |
---|---|---|
3 (can be shared with chunk servers and clients) | 5-9 (can be shared with metadata servers and clients) | 5 or more (depending on the number of clients and chunk servers and whether they share roles) |
5 (can be shared with chunk servers and clients) | 10 or more (can be shared with metadata servers and clients) | 5 or more (depending on the number of clients and chunk servers and whether they share roles) |
Clients | 1 or more. You can include any number of clients in the cluster. For example, if you have 5 servers with Virtuozzo Hybrid Server, you can configure them all to act as clients. You can share servers acting as clients with chunk and metadata servers. For example, you can have 5 physical servers and configure each of them to simultaneously act as an MDS server, a chunk server, and a client. |
Even though new clusters are configured to have 1 replica for each data chunk by default, you need to configure each data chunk to have at least 3 replicas to provide high availability for your data.
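The effect of the replica count on capacity and fault tolerance comes down to simple arithmetic. The following is a rough sketch that ignores metadata and file system overhead; the function names and the 90 TB figure are illustrative assumptions, not values from the product.

```python
def usable_capacity_tb(raw_capacity_tb: float, replicas: int) -> float:
    """Approximate usable space when every chunk is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_capacity_tb / replicas


def tolerated_replica_losses(replicas: int) -> int:
    """Number of copies of a chunk that can be lost while one copy survives."""
    return replicas - 1


# Example: 90 TB of raw disk space with 3 replicas per chunk.
print(usable_capacity_tb(90, 3))      # 30.0
print(tolerated_replica_losses(3))    # 2
print(tolerated_replica_losses(1))    # 0 (the default of 1 replica has no redundancy)
```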
In total, at least 9 machines running Virtuozzo Storage are recommended per cluster. Smaller clusters will work fine but will not provide significant performance advantages over direct-attached storage (DAS) or improved recovery times.
Note the following:
For large clusters, it is critically important to configure proper failure domains to improve data availability. For more information, see Configuring Failure Domains.
In small and medium clusters, MDS servers consume few resources and do not need to be set up on dedicated Hardware Nodes.
A small cluster is 3 to 5 machines, a medium cluster is 6 to 15-20 machines, and a large cluster is 15-20 machines or more.
Time should be synchronized on all servers in the cluster via NTP. Doing so will make it easier for the support department to understand cluster logs (migrations, failovers, etc.). By default, Virtuozzo Hybrid Server 7 uses the chronyd service for time synchronization. If you want to use ntpdate or ntpd, stop and disable chronyd first.
1.3.3. Hardware Requirements
Before setting up a Virtuozzo Storage cluster, make sure you have all the necessary equipment at hand.
You are also recommended to:
Consult Using SSD Drives to learn how you can increase cluster performance by using solid-state drives for write journaling and data caching, and how many SSDs you may need depending on the number of HDDs in your cluster.
Check Appendix A - Troubleshooting for common hardware issues and misconfigurations that may affect your cluster performance and lead to data inconsistency and corruption.
General
Each service (be it MDS, CS, or client) requires 1.5 GB of free space on the root partition for logs. For example, to run 1 metadata server, 1 client, and 12 chunk servers on a host, you will need 21 GB of free space on the root partition.
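The log space calculation above can be reproduced for any mix of services on a host. This is a small illustrative helper; the function name `root_log_space_gb` is an assumption, and only the 1.5 GB per service figure comes from the requirement itself.

```python
LOG_SPACE_PER_SERVICE_GB = 1.5  # free root partition space required per service


def root_log_space_gb(mds: int, chunk_servers: int, clients: int) -> float:
    """Free space needed on the root partition for the logs of all services."""
    return (mds + chunk_servers + clients) * LOG_SPACE_PER_SERVICE_GB


# The example from the text: 1 MDS, 12 chunk servers, and 1 client on one host.
print(root_log_space_gb(mds=1, chunk_servers=12, clients=1))  # 21.0
```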
Metadata Servers
1 CPU core
1 GB of RAM per 100 TB of data in the cluster
3 GB of disk space per 100 TB of data in the cluster
1 or more Ethernet adapters (1 Gbit/s or faster)
Note
It is recommended to place the MDS journal on an SSD, either dedicated or shared with CS and client caches, or at least on a dedicated HDD that has no CS services on it.
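As a rough sizing aid, the per-100 TB figures for metadata servers can be scaled to the amount of data you plan to keep in the cluster. The sketch below assumes strictly linear scaling, which the requirements do not state explicitly; in practice you would round up and leave headroom. The function name `mds_requirements` is hypothetical.

```python
def mds_requirements(cluster_data_tb: float) -> dict:
    """Scale the MDS requirements to the amount of data in the cluster.

    Assumes linear scaling of the per-100 TB figures (an assumption).
    """
    units = cluster_data_tb / 100  # number of 100 TB units of cluster data
    return {
        "cpu_cores": 1,          # fixed: 1 CPU core
        "ram_gb": 1 * units,     # 1 GB of RAM per 100 TB of data
        "disk_gb": 3 * units,    # 3 GB of disk space per 100 TB of data
    }


# Example: a cluster expected to hold 250 TB of data.
print(mds_requirements(250))  # {'cpu_cores': 1, 'ram_gb': 2.5, 'disk_gb': 7.5}
```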
Chunk Servers
1/8 of a CPU core (e.g., 1 CPU core per 8 CS)
1 GB of RAM
100 GB or more of free disk space
1 or more Ethernet adapters (1 Gbit/s or faster)
Note the following:
For details on using local RAID with Virtuozzo Storage, see Exploring Possible Disk Drive Configurations.
Using a shared JBOD array across multiple nodes running CS services may introduce a single point of failure and make the cluster unavailable if all data replicas happen to be allocated and stored on the failed JBOD. For more information, see Configuring Failure Domains.
For large clusters, it is critically important to configure proper failure domains to improve data availability. For more information, see Configuring Failure Domains.
Do not place chunk servers on disks already used in other I/O workloads, e.g., system or swap. Sharing disks between CS and other sources of I/O will result in severe performance loss and high I/O latencies.
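To estimate what a host needs when it runs several chunk servers, the figures above can be multiplied by the number of CS services. This sketch assumes that each listed value applies per chunk server service, which follows from the "1 CPU core per 8 CS" example; the function name is hypothetical.

```python
CPU_CORES_PER_CS = 1 / 8    # 1 CPU core per 8 chunk servers
RAM_GB_PER_CS = 1           # 1 GB of RAM per chunk server
MIN_DISK_GB_PER_CS = 100    # at least 100 GB of free disk space per chunk server


def chunk_server_requirements(num_cs: int) -> dict:
    """Total resources needed by `num_cs` chunk servers on one host."""
    return {
        "cpu_cores": num_cs * CPU_CORES_PER_CS,
        "ram_gb": num_cs * RAM_GB_PER_CS,
        "min_disk_gb": num_cs * MIN_DISK_GB_PER_CS,
    }


# Example: a host with 12 disks, each running its own chunk server.
print(chunk_server_requirements(12))
# {'cpu_cores': 1.5, 'ram_gb': 12, 'min_disk_gb': 1200}
```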
Clients
1 CPU core per 30,000 IOPS
1 GB of RAM
1 or more Ethernet adapters (1 Gbit/s or faster)
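The CPU requirement for clients depends on the expected I/O load. A minimal sketch of the calculation follows; rounding up to whole cores and the minimum of one core are assumptions, as is the function name.

```python
import math

IOPS_PER_CORE = 30_000  # 1 CPU core per 30,000 IOPS


def client_cpu_cores(expected_iops: int) -> int:
    """Whole CPU cores a client needs to serve the expected IOPS."""
    return max(1, math.ceil(expected_iops / IOPS_PER_CORE))


# Example: a client expected to generate 100,000 IOPS.
print(client_cpu_cores(100_000))  # 4
```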
The following table lists the maximum network performance a Virtuozzo Storage client can get with the specified network interface. For clients, it is recommended to use 10Gbps network hardware between any two cluster nodes and to minimize network latencies, especially if SSDs are used.
Storage network interface | 1Gbps | 2 x 1Gbps | 3 x 1Gbps | 10Gbps | 2 x 10Gbps |
---|---|---|---|---|---|
Entire node maximum I/O throughput | 100MB/s | ~175MB/s | ~250MB/s | 1GB/s | 1.75GB/s |
Single VM maximum I/O throughput (replication) | 100MB/s | 100MB/s | 100MB/s | 1GB/s | 1GB/s |
Single VM maximum I/O throughput (erasure coding) | 70MB/s | ~130MB/s | ~180MB/s | 700MB/s | 1.3GB/s |
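The table can also be used to estimate a lower bound on how long a large transfer takes. The sketch below copies the table values into a dictionary, treats the approximate (~) figures as exact, and assumes 1 GB = 1000 MB. The dictionary keys and the function name are illustrative, and real transfers will be slower than this bound.

```python
# Maximum throughput in MB/s, copied from the table above.
MAX_THROUGHPUT_MBPS = {
    "1Gbps":    {"node": 100,  "vm_replication": 100,  "vm_erasure_coding": 70},
    "2x1Gbps":  {"node": 175,  "vm_replication": 100,  "vm_erasure_coding": 130},
    "3x1Gbps":  {"node": 250,  "vm_replication": 100,  "vm_erasure_coding": 180},
    "10Gbps":   {"node": 1000, "vm_replication": 1000, "vm_erasure_coding": 700},
    "2x10Gbps": {"node": 1750, "vm_replication": 1000, "vm_erasure_coding": 1300},
}


def min_transfer_seconds(size_gb: float, interface: str, mode: str) -> float:
    """Lower bound on the time needed to read or write `size_gb` of data."""
    return size_gb * 1000 / MAX_THROUGHPUT_MBPS[interface][mode]


# Example: moving a 500 GB virtual machine disk with replication.
print(min_transfer_seconds(500, "1Gbps", "vm_replication"))   # 5000.0
print(min_transfer_seconds(500, "10Gbps", "vm_replication"))  # 500.0
```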
Note
For hard disk requirements and the recommended partitioning scheme for servers that run Virtuozzo Hybrid Server and participate in clusters, see Partitioning the Hard Drives.
1.3.4. Network Requirements
When planning your network, make sure that it:
Operates at 1 Gbit/s or faster (for more details, see Using 1 GbE and 10 GbE Networks)
Has non-blocking Ethernet switches
You should use separate networks and Ethernet adapters for user and cluster traffic. This will prevent possible I/O performance degradation in your cluster caused by external traffic. Besides, if a cluster is accessible from public networks (e.g., from the Internet), it may become a target of Denial-of-Service attacks, and the entire cluster I/O subsystem may stall.
The figure below shows a sample network configuration for Virtuozzo Storage.
In this network configuration:
BackNet is a private network used solely for interconnection and intercommunication of servers in the cluster and is not available from the public network. All servers in the cluster are connected to this network via one of their network cards.
FrontNet is a public network customers use to access their virtual machines and containers in the Virtuozzo Storage cluster.
Note the following:
Network switches are a very common point of failure, so it is critically important to configure proper failure domains to improve data availability. For more information, see Configuring Failure Domains.
To learn more about Virtuozzo Storage networks (in particular, how to bind chunk servers to specific IP addresses), see Securing Server Communication in the Cluster.