Redundancy by replication

With replication, Virtuozzo Hybrid Infrastructure breaks the incoming data stream into 256 MB chunks. Each chunk is replicated, and the replicas are stored in different failure domains, so that each failure domain holds no more than one replica of a given chunk.
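
The chunking and placement rule above can be illustrated with a minimal Python sketch. The round-robin policy, the function names, and the host names are hypothetical and chosen only for illustration; the actual placement algorithm used by Virtuozzo Hybrid Infrastructure is not described here and is likely more elaborate. The only facts taken from the text are the 256 MB chunk size and the constraint that no failure domain holds more than one replica of a given chunk.

```python
import math

CHUNK_SIZE_MB = 256  # chunk size stated in the text above


def chunk_count(data_size_mb: int) -> int:
    """Number of chunks needed for data_size_mb of data (the last chunk may be partial)."""
    return math.ceil(data_size_mb / CHUNK_SIZE_MB)


def place_replicas(chunk_index: int, domains: list[str], replicas: int) -> list[str]:
    """Pick `replicas` distinct failure domains for one chunk.

    Hypothetical round-robin policy: it only demonstrates the constraint
    that each failure domain gets at most one replica of a given chunk.
    """
    if replicas > len(domains):
        raise ValueError("need at least as many failure domains as replicas")
    start = chunk_index % len(domains)
    return [domains[(start + i) % len(domains)] for i in range(replicas)]


# Example: 1 GB of data, 2 replicas, host failure domain with three hosts.
hosts = ["host1", "host2", "host3"]  # hypothetical host names
layout = {i: place_replicas(i, hosts, replicas=2) for i in range(chunk_count(1024))}
for chunk, placed in layout.items():
    print(f"chunk {chunk}: {placed}")
```

In this sketch, requesting more replicas than there are failure domains raises an error, which mirrors the requirement that every replica of a chunk must land in a separate failure domain.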

The following diagram illustrates the 2 replicas redundancy mode with the host failure domain.

Replication in Virtuozzo Hybrid Infrastructure is similar to the RAID rebuild process, but has two key differences:

  • Replication in Virtuozzo Hybrid Infrastructure is much faster than a typical online RAID 1/5/10 rebuild, because Virtuozzo Hybrid Infrastructure replicates chunks in parallel to multiple failure domains.
  • The more storage nodes there are in a cluster, the faster the cluster recovers from a disk or node failure.

High replication performance minimizes the periods of reduced redundancy for the cluster. Replication performance is affected by:

  • The number of available storage nodes. Because replication runs in parallel, the more replication sources and destinations are available, the faster it completes.
  • Performance of storage node disks.
  • Network performance. All replicas are transferred between failure domains over the network. For example, 1 Gbps throughput can be a bottleneck (refer to Network requirements and recommendations).
  • Distribution of data in the cluster. Some storage nodes may have much more data to replicate than others and may become overloaded during replication.
  • I/O activity in the cluster during replication.
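
To get a feel for how these factors interact, the following Python sketch gives a back-of-the-envelope estimate of replication time. It is a simplified model, not a formula used by Virtuozzo Hybrid Infrastructure: the function, its parameter names, and the sample figures are all hypothetical. It assumes that each parallel stream is limited by the slower of the network link and the disk, that data is spread evenly across streams, and that no other I/O competes with replication.

```python
def recovery_time_seconds(
    data_to_replicate_gb: float,
    link_gbps: float,
    disk_mb_per_s: float,
    parallel_streams: int,
) -> float:
    """Rough lower bound on the time needed to restore redundancy.

    Assumptions (all hypothetical): decimal units, even data distribution,
    no competing cluster I/O, one independent source/destination pair per stream.
    """
    link_mb_per_s = link_gbps * 1000 / 8  # Gbps -> MB/s
    per_stream = min(link_mb_per_s, disk_mb_per_s)  # the slower resource limits a stream
    aggregate = per_stream * parallel_streams  # streams run in parallel
    return data_to_replicate_gb * 1000 / aggregate


# Example: 2 TB to re-replicate, disks capable of 150 MB/s.
single = recovery_time_seconds(2000, link_gbps=1, disk_mb_per_s=150, parallel_streams=1)
parallel = recovery_time_seconds(2000, link_gbps=1, disk_mb_per_s=150, parallel_streams=4)
faster_net = recovery_time_seconds(2000, link_gbps=10, disk_mb_per_s=150, parallel_streams=1)
print(f"1 Gbps, 1 stream:  {single / 3600:.1f} h")
print(f"1 Gbps, 4 streams: {parallel / 3600:.1f} h")
print(f"10 Gbps, 1 stream: {faster_net / 3600:.1f} h")
```

Under these assumptions, a 1 Gbps link (about 125 MB/s) is the limiting factor for a single stream, which takes roughly 4.4 hours for 2 TB; four parallel streams cut that to about 1.1 hours. With a 10 Gbps link, the disk becomes the limit instead. Real clusters will deviate from this because of uneven data distribution and concurrent I/O.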