High availability keeps virtual machines operational if the node they are located on fails because of a kernel crash or power outage, or becomes unreachable over the network. A graceful shutdown is not considered a failure event.
In the event of failure, the system will attempt to evacuate the affected VMs automatically, that is, migrate them offline with auto-scheduling to other healthy compute nodes in the following order:
- VMs with the “Active” status are evacuated first and automatically started.
- VMs with the “Shut down” status are evacuated next and remain stopped.
- All other VMs are ignored and left on the failed node.
If something blocks the evacuation, for example, if the destination compute nodes lack the resources to host the affected VMs, these VMs remain on the failed node and receive the “Error” status. You can evacuate them manually after resolving the issue (for example, by providing sufficient resources or joining new nodes to the cluster).
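The evacuation order and the “Error” fallback described above can be illustrated with a minimal Python sketch. It is not the product's scheduler: the `VM` and `Node` classes, the `plan_evacuation` function, and the use of free vCPUs as the only resource check are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    status: str   # "Active", "Shut down", or any other status
    vcpus: int    # hypothetical resource requirement


@dataclass
class Node:
    name: str
    free_vcpus: int  # hypothetical measure of spare capacity


def plan_evacuation(vms, healthy_nodes):
    """Return {vm_name: (destination_node_or_None, resulting_status)}.

    VMs that are neither "Active" nor "Shut down" are ignored and
    therefore do not appear in the returned plan.
    """
    priority = {"Active": 0, "Shut down": 1}
    candidates = [vm for vm in vms if vm.status in priority]
    plan = {}
    # sorted() is stable, so VMs keep their relative order within a status.
    for vm in sorted(candidates, key=lambda v: priority[v.status]):
        # Assumed placement rule: first healthy node with enough free vCPUs.
        dest = next((n for n in healthy_nodes if n.free_vcpus >= vm.vcpus), None)
        if dest is None:
            # Evacuation is blocked: the VM stays on the failed node.
            plan[vm.name] = (None, "Error")
            continue
        dest.free_vcpus -= vm.vcpus
        # "Active" VMs are started again; "Shut down" VMs remain stopped.
        plan[vm.name] = (dest.name, vm.status)
    return plan
```

In this sketch, a VM that does not fit on any healthy node is recorded with the “Error” status and no destination, which mirrors the case where manual evacuation is required later.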
By default, high availability for virtual machines is enabled automatically after the compute cluster is created. If required, you can disable it manually. Keep in mind that virtual machines with disabled high availability will not be evacuated to healthy nodes if their node fails.
- The compute cluster can survive the failure of only one node.
- Virtual machines are created as described in Creating virtual machines.
To disable high availability for virtual machines
To evacuate virtual machines manually