Redundancy by replication

With replication, Virtuozzo Hybrid Infrastructure breaks the incoming data stream into 256 MB chunks. Each chunk is replicated, and the replicas are stored on different storage nodes, so that each node holds at most one replica of a given chunk.
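The placement rule can be illustrated with a minimal Python sketch. It is not the product's placement algorithm, which this section does not describe: the function name `place_replicas`, the node names, and the random choice of nodes are all assumptions made for illustration. Only the 256 MB chunk size and the rule that no node holds two replicas of the same chunk come from the text above.

```python
import math
import random

CHUNK_SIZE_MB = 256  # chunk size stated in this section


def place_replicas(data_size_mb, nodes, replicas=2, seed=0):
    """Split data into chunks and pick distinct nodes for each chunk's replicas.

    Hypothetical helper: random selection stands in for the real placement
    logic, which is not documented here.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many storage nodes as replicas")
    rng = random.Random(seed)
    n_chunks = math.ceil(data_size_mb / CHUNK_SIZE_MB)
    # rng.sample returns distinct nodes, so a node never receives
    # two replicas of the same chunk.
    return [(i, rng.sample(nodes, replicas)) for i in range(n_chunks)]


# Illustrative node names; 1000 MB of data needs ceil(1000 / 256) = 4 chunks.
placement = place_replicas(1000, ["node1", "node2", "node3"], replicas=2)
for chunk_index, holders in placement:
    print(chunk_index, holders)
```

Requesting more replicas than there are storage nodes raises an error in this sketch, because the one-replica-per-node rule could not be satisfied.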

The following diagram illustrates the 2 replicas redundancy mode.

Replication in Virtuozzo Hybrid Infrastructure is similar to the RAID rebuild process, but has two key differences:

  • Replication in Virtuozzo Hybrid Infrastructure is much faster than a typical online RAID 1/5/10 rebuild, because Virtuozzo Hybrid Infrastructure replicates chunks in parallel to multiple storage nodes.
  • The more storage nodes a cluster has, the faster it recovers from a disk or node failure.

High replication performance minimizes the periods of reduced redundancy for the cluster. Replication performance is affected by:

  • The number of available storage nodes. Because replication runs in parallel, the more replication sources and destinations are available, the faster it completes.
  • Performance of storage node disks.
  • Network performance. All replicas are transferred between storage nodes over the network. For example, a 1 Gbps link can be a bottleneck (refer to Per-node network requirements and recommendations).
  • Distribution of data in the cluster. Some storage nodes may have much more data to replicate than others and may become overloaded during replication.
  • I/O activity in the cluster during replication.
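
To get a feel for how the first three factors interact, the following Python sketch gives a back-of-the-envelope estimate of re-replication time after a node failure. It is a simplified model, not a formula from the product: the function `estimate_recovery_seconds`, its parameters, and the assumption that about half of the surviving nodes send while the other half receive are all hypothetical. It ignores uneven data distribution and concurrent cluster I/O, so real recovery will take longer.

```python
def estimate_recovery_seconds(data_gb, surviving_nodes, disk_mbps, network_gbps):
    """Rough lower bound on the time to re-replicate data_gb after a failure.

    Hypothetical model: all values are illustrative inputs, not product limits.
    """
    if surviving_nodes < 2:
        raise ValueError("re-replication needs at least two surviving nodes")
    network_mbps = network_gbps * 1000 / 8  # convert Gbit/s to MB/s
    # The slower of disk and network limits each transfer.
    per_stream_mbps = min(disk_mbps, network_mbps)
    # Assumption: each transfer occupies one source and one destination node,
    # so about surviving_nodes // 2 transfers run at the same time.
    parallel_streams = surviving_nodes // 2
    return data_gb * 1000 / (per_stream_mbps * parallel_streams)


# 1 TB to re-replicate, disks sustaining 200 MB/s (illustrative figures).
slow_net = estimate_recovery_seconds(1000, surviving_nodes=4, disk_mbps=200, network_gbps=1)
fast_net = estimate_recovery_seconds(1000, surviving_nodes=4, disk_mbps=200, network_gbps=10)
more_nodes = estimate_recovery_seconds(1000, surviving_nodes=8, disk_mbps=200, network_gbps=1)
print(slow_net, fast_net, more_nodes)
```

In this model, a 1 Gbps network (125 MB/s) caps each transfer below what the disks can sustain, which matches the bottleneck noted above. Moving to 10 Gbps shifts the limit to the disks, and doubling the number of surviving nodes halves the estimate.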