2.1. Storage Architecture Overview
The fundamental component of Virtuozzo Infrastructure Platform is a storage cluster: a group of physical servers interconnected by a network. Each server in a cluster is assigned one or more roles and typically runs the services that correspond to these roles:
- storage role: chunk service or CS
- metadata role: metadata service or MDS
- supplementary roles: SSD cache, system
Any server in the cluster can be assigned a combination of storage, metadata, and network roles. For example, a single server can be an S3 access point, an iSCSI access point, and a storage node at once.
Each cluster also requires that a web-based admin panel be installed on one (and only one) of the nodes. The panel enables administrators to manage the cluster.
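The role rules above can be sketched as a small data model. This is an illustration only: the role labels, the `Node` class, and the helper functions are hypothetical names chosen for the example and are not product identifiers or part of any Virtuozzo API.

```python
from dataclasses import dataclass, field

# Illustrative role labels (assumptions, not product identifiers).
STORAGE = "storage"
METADATA = "metadata"
ISCSI = "iscsi_access_point"
S3 = "s3_access_point"
ADMIN_PANEL = "admin_panel"


@dataclass
class Node:
    name: str
    roles: set = field(default_factory=set)


def admin_panel_nodes(nodes):
    """Return the names of nodes that host the admin panel."""
    return [n.name for n in nodes if ADMIN_PANEL in n.roles]


def has_single_admin_panel(nodes):
    """Check the rule that exactly one node hosts the admin panel."""
    return len(admin_panel_nodes(nodes)) == 1


cluster = [
    Node("node1", {STORAGE, METADATA, ADMIN_PANEL}),
    Node("node2", {STORAGE, ISCSI, S3}),  # several roles on one server
    Node("node3", {STORAGE, METADATA}),
]
print(has_single_admin_panel(cluster))
```

Modeling roles as a set per node reflects that one server can combine several roles, while the admin panel check is a cluster-wide constraint.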
2.1.1. Storage Role
Storage nodes run chunk services, store all the data in the form of fixed-size chunks, and provide access to these chunks. All data chunks are replicated and the replicas are kept on different storage nodes to achieve high availability of data. If one of the storage nodes fails, remaining healthy storage nodes continue providing the data chunks that were stored on the failed node.
Only a server with disks of a certain capacity can be assigned the storage role.
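To illustrate why keeping replicas on different storage nodes preserves access after a node failure, here is a minimal sketch. The chunk size, the replica count, the round-robin placement, and all function names are assumptions made for the example; the product's actual placement logic is not described in this section.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Split data into fixed-size chunks (in this sketch the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks: int, nodes: list, replicas: int) -> dict:
    """Map each chunk index to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        c: [nodes[(c + r) % len(nodes)] for r in range(replicas)]
        for c in range(num_chunks)
    }


def readable_chunks(placement: dict, failed: set) -> set:
    """Return chunk indexes that still have a replica on a healthy node."""
    return {c for c, holders in placement.items()
            if any(n not in failed for n in holders)}


nodes = ["node1", "node2", "node3"]
chunks = split_into_chunks(b"x" * 10, chunk_size=4)  # three chunks
placement = place_replicas(len(chunks), nodes, replicas=2)

# With two replicas on distinct nodes, one failed node leaves every chunk readable.
print(readable_chunks(placement, failed={"node2"}) == set(placement))
```

In this sketch, losing two of the three nodes makes some chunks unreadable, which is why the replica count bounds how many simultaneous node failures the data can survive.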
2.1.2. Metadata Role
Metadata nodes run metadata services, store cluster metadata, and control how user files are split into chunks and where these chunks are located. Metadata nodes also ensure that chunks have the required number of replicas and log all important events that happen in the cluster.
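The replica check described above can be pictured as follows. This is a hedged sketch, not the metadata service's actual algorithm: the `under_replicated` function, the chunk identifiers, and the placement map are hypothetical.

```python
def under_replicated(placement: dict, healthy: set, required: int) -> dict:
    """Return {chunk_id: missing_replica_count} for chunks below `required`."""
    missing = {}
    for chunk_id, holders in placement.items():
        # Count only replicas that sit on nodes currently considered healthy.
        live = sum(1 for node in holders if node in healthy)
        if live < required:
            missing[chunk_id] = required - live
    return missing


placement = {
    "chunk-a": ["node1", "node2", "node3"],
    "chunk-b": ["node2", "node3", "node4"],
}

# node2 is down, so both chunks are one replica short of the required three.
print(under_replicated(placement, healthy={"node1", "node3", "node4"}, required=3))
```

A chunk reported here would need new replicas created on other healthy nodes to return to the required count.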
To provide system reliability, Virtuozzo Infrastructure Platform uses the Paxos consensus algorithm. It guarantees fault tolerance as long as a majority of the nodes running metadata services are healthy.
To ensure high availability of metadata in a production environment, at least three nodes in a cluster must be running metadata services. In this case, if one metadata service fails, the remaining two still form a majority and keep controlling the cluster. However, it is recommended to have at least five metadata services, so that the cluster can survive the simultaneous failure of two nodes without data loss.
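The three-node and five-node recommendations follow from simple majority arithmetic. The sketch below assumes only the majority rule stated above; the function names are illustrative.

```python
def quorum(mds_count: int) -> int:
    """Smallest number of metadata services that forms a majority."""
    return mds_count // 2 + 1


def tolerated_failures(mds_count: int) -> int:
    """Failures that still leave a majority of metadata services healthy."""
    return mds_count - quorum(mds_count)


# Columns: metadata services, majority size, failures tolerated.
for n in (1, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

With three metadata services the majority is two, so one failure is tolerated; with five the majority is three, so two simultaneous failures are tolerated. An even count adds no tolerance over the odd count below it (four services still tolerate only one failure).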
2.1.3. Supplementary Roles
- SSD cache: boosts chunk read/write performance by creating write caches on selected solid-state drives (SSDs). It is recommended to also use such SSDs for metadata (see Metadata Role). The use of write journals may speed up write operations in the cluster by a factor of two or more.
- System: one disk per node that is reserved for the operating system and unavailable for data storage.