Disk requirements

Disk types and roles

  • Using SATA HDDs with one SSD for caching is more cost-effective than using only SAS HDDs without such an SSD.
  • Using NVMe or SAS SSDs for write caching improves random I/O performance and is highly recommended for all workloads with heavy random access (for example, iSCSI volumes). SATA SSDs, in turn, are best suited for SSD-only configurations, not for write caching.
  • Running metadata services on SSDs improves cluster performance. To also minimize CAPEX, the same SSDs can be used for write caching.
  • Shingled magnetic recording (SMR) HDDs are supported only for storage purposes and only if the node has an SSD for caching.
  • If capacity is the main goal and you need to store infrequently accessed data, select SATA disks over SAS ones. If performance is the main goal, select NVMe or SAS disks over SATA ones.
  • Disk block size (for example, 512 bytes or 4 KiB) has no effect on performance.
  • The maximum supported physical partition size is 254 TiB.

Disk capacity

  • The system disk must have at least 100 GB of space.
  • It is possible to use disks of different sizes in the same cluster. However, keep in mind that, given the same IOPS, smaller disks offer higher performance per terabyte of data than bigger disks. It is recommended to group disks with the same IOPS per terabyte in the same tier.
  • The capacity of HDDs and SSDs is measured and specified with decimal, not binary, prefixes, so “TB” in disk specifications usually means “terabyte” (10^12 bytes). The operating system, however, displays drive capacity using binary prefixes, so its “TB” actually means “tebibyte” (2^40 bytes), which is a noticeably larger unit. As a result, disks may show a smaller capacity than the one marketed by the vendor. For example, a disk with 6 TB in specifications may be shown to have 5.45 TiB of actual disk space in Virtuozzo Hybrid Infrastructure. In addition, 5 percent of disk space is reserved for emergency needs. Therefore, if you add a 6 TB disk to a cluster, the available physical space should increase by about 5.2 TiB.
  • SSD performance may depend on drive capacity. Lower-capacity drives (100 to 400 GB) may perform much slower (sometimes up to ten times slower) than higher-capacity ones (1.9 to 3.8 TB). Check the drive performance and endurance specifications before purchasing hardware.
  • Thin provisioning is always enabled for all data and cannot be configured otherwise.
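
The capacity arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a product tool: the function names are hypothetical, the 5 percent reserve is taken from the text above, and the 150 IOPS figure in the usage note is an assumed value for a generic HDD.

```python
TB = 10**12   # bytes in one terabyte (decimal prefix, used in disk specifications)
TIB = 2**40   # bytes in one tebibyte (binary prefix, used by the operating system)

RESERVED_FRACTION = 0.05  # 5 percent kept for emergency needs, per the text above


def marketed_tb_to_tib(size_tb: float) -> float:
    """Convert a vendor-specified size in terabytes to tebibytes."""
    return size_tb * TB / TIB


def usable_tib(size_tb: float, reserved: float = RESERVED_FRACTION) -> float:
    """Estimate the space a disk adds to the cluster after the reserve."""
    return marketed_tb_to_tib(size_tb) * (1.0 - reserved)


def iops_per_tb(iops: float, size_tb: float) -> float:
    """Performance density used to decide which disks belong in the same tier."""
    return iops / size_tb
```

For the 6 TB disk from the example, usable_tib(6) evaluates to roughly 5.2. Likewise, iops_per_tb(150, 2) and iops_per_tb(150, 8) give 75 and 18.75, which shows why two disks with the same IOPS but different sizes may belong in different tiers.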

Consumer-grade SSD drives

  • Consumer-grade SSD drives can withstand only a small number of rewrites. SSD drives intended for storage clusters must offer an endurance of at least 1 drive write per day (DWPD); 10 DWPD is recommended. The higher the endurance, the less often SSDs need to be replaced, which improves the total cost of ownership (TCO).
  • Consumer-grade SSD drives usually have unstable performance and are not suited to withstand sustained enterprise workloads. For this reason, pay attention to sustained load tests when choosing SSDs.
  • Many consumer-grade SSD drives can ignore disk flushes and falsely report to the operating system that data was written when, in fact, it was not. Examples of such drives include OCZ Vertex 3, Intel 520, Intel X25-E, and Intel X25-M G2. These drives are known to be unsafe in terms of data commits. They should not be used with databases, and they may easily corrupt the file system in case of a power failure. It is recommended to use enterprise-grade SSD drives with power loss protection, as described in Protecting data during a power outage.
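
Vendors often state endurance as terabytes written (TBW) rather than DWPD. The sketch below converts between the two with the commonly used relation TBW = DWPD × capacity × 365 × warranty years. The function names are hypothetical, and the five-year warranty period is an assumption; use the period from the drive's data sheet.

```python
MIN_DWPD = 1.0  # minimum endurance required by the text above


def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float = 5) -> float:
    """Total terabytes that may be written over the warranty period."""
    return dwpd * capacity_tb * 365 * warranty_years


def dwpd_from_tbw(tbw: float, capacity_tb: float, warranty_years: float = 5) -> float:
    """Drive writes per day implied by a TBW rating."""
    return tbw / (capacity_tb * 365 * warranty_years)


def meets_minimum(tbw: float, capacity_tb: float, warranty_years: float = 5) -> bool:
    """Check a TBW rating against the 1 DWPD minimum."""
    return dwpd_from_tbw(tbw, capacity_tb, warranty_years) >= MIN_DWPD
```

As an illustration with made-up ratings, a 1 TB drive rated for 600 TBW over five years works out to about 0.33 DWPD and falls short of the minimum, while a 1.92 TB drive needs a rating of about 3,504 TBW to reach 1 DWPD.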

RAID and HBA controllers

  • Create hardware or software RAID1 volumes for system disks by using RAID or HBA controllers, respectively, to ensure their high performance and availability.
  • Use HBA controllers, as they are less expensive and easier to manage than RAID controllers.
  • Disable all RAID controller caches for SSD drives. Modern SSDs perform well on their own, and a RAID controller’s read and write cache can reduce that performance. Leave caching enabled only for HDD drives.
  • If you use RAID controllers, do not create RAID volumes from HDDs intended for storage. Each storage HDD needs to be recognized by Virtuozzo Hybrid Infrastructure as a separate device.
  • If you use RAID controllers with caching, equip them with backup battery units (BBUs) to protect against cache loss during power outages.