6.16. Improving Uptime with Virtual Machine High Availability

6.16.1. What is High Availability?

High availability is recommended for virtual machines running critical workloads. A highly available virtual machine is automatically restarted, either on its original host or another host in the cluster, if its process is interrupted, such as in the following scenarios:

  • A host becomes non-operational due to hardware failure.
  • A host is put into maintenance mode for scheduled downtime.
  • A host becomes unavailable because it has lost communication with an external storage resource.

A highly available virtual machine is not restarted if it is shut down cleanly, such as in the following scenarios:

  • The virtual machine is shut down from within the guest.
  • The virtual machine is shut down from the Manager.
  • The host is shut down by an administrator without being put in maintenance mode first.
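
The two lists above amount to a simple rule: a highly available virtual machine is restarted after an interruption, but not after a clean shutdown. The following Python sketch is illustrative only. The StopReason labels and the should_restart function are hypothetical names chosen for this example and are not part of the Manager.

```python
from enum import Enum, auto


class StopReason(Enum):
    """Hypothetical labels for the scenarios listed above."""
    HOST_HARDWARE_FAILURE = auto()
    HOST_MAINTENANCE = auto()
    HOST_LOST_STORAGE = auto()
    GUEST_SHUTDOWN = auto()
    MANAGER_SHUTDOWN = auto()
    HOST_SHUTDOWN_BY_ADMIN = auto()


# Scenarios in which the virtual machine process is interrupted.
INTERRUPTIONS = {
    StopReason.HOST_HARDWARE_FAILURE,
    StopReason.HOST_MAINTENANCE,
    StopReason.HOST_LOST_STORAGE,
}


def should_restart(highly_available: bool, reason: StopReason) -> bool:
    """Return True only for a highly available VM that was interrupted."""
    return highly_available and reason in INTERRUPTIONS
```

In this sketch, a clean shutdown never triggers a restart, whether or not the virtual machine is highly available.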

With storage domains V4 or later, virtual machines have the additional capability to acquire a lease on a special volume on the storage, enabling a virtual machine to start on another host even if the original host loses power. The functionality also prevents the virtual machine from being started on two different hosts, which may lead to corruption of the virtual machine disks.
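
The following toy model may help illustrate why a lease prevents the same virtual machine from running on two hosts. It is a conceptual sketch only. The VmLease class and its methods are hypothetical and do not reflect how the lease volume is implemented on the storage. Releasing the lease stands in for the lease becoming available again after the original host loses power.

```python
class VmLease:
    """Toy stand-in for a virtual machine lease (hypothetical)."""

    def __init__(self) -> None:
        self.holder = None  # name of the host currently holding the lease

    def acquire(self, host: str) -> bool:
        """Grant the lease only if no other host holds it."""
        if self.holder is None or self.holder == host:
            self.holder = host
            return True
        return False

    def release(self, host: str) -> None:
        """Free the lease, but only if this host is the current holder."""
        if self.holder == host:
            self.holder = None


lease = VmLease()
started_on_host1 = lease.acquire("host1")  # first host gets the lease
started_on_host2 = lease.acquire("host2")  # second host is refused
```

A host that cannot acquire the lease does not start the virtual machine, so the disks are never written to from two hosts at once.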

With high availability, interruption to service is minimal because virtual machines are restarted within seconds with no user intervention required. High availability keeps your resources balanced by restarting guests on a host with low current resource utilization, or based on any workload balancing or power saving policies that you configure. This ensures that there is sufficient capacity to restart virtual machines at all times.

High Availability and Storage I/O Errors

If a storage I/O error occurs, the virtual machine is paused. You can define how the host handles highly available virtual machines after the connection with the storage domain is reestablished: they can be resumed, ungracefully shut down, or left paused. For more information about these options, see Section A.1.6, “Virtual Machine High Availability Settings Explained”.
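
As a rough illustration, the three outcomes can be viewed as a mapping from the configured resume behavior to the state of the virtual machine once storage is reachable again. The option names AUTO_RESUME, LEAVE_PAUSED, and KILL are the ones offered in the Resume Behavior drop-down list; the function name and the state strings are assumptions made for this sketch.

```python
from enum import Enum


class ResumeBehavior(Enum):
    AUTO_RESUME = "AUTO_RESUME"
    LEAVE_PAUSED = "LEAVE_PAUSED"
    KILL = "KILL"


def state_after_storage_recovery(behavior: ResumeBehavior) -> str:
    """Return a hypothetical state label for a VM paused by an I/O error."""
    outcomes = {
        ResumeBehavior.AUTO_RESUME: "running",   # resumed
        ResumeBehavior.LEAVE_PAUSED: "paused",   # remains paused
        ResumeBehavior.KILL: "down",             # ungracefully shut down
    }
    return outcomes[behavior]
```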

6.16.2. High Availability Considerations

A highly available host requires a power management device and fencing parameters. In addition, for a virtual machine to be highly available when its host becomes non-operational, it needs to be started on another available host in the cluster. To enable the migration of highly available virtual machines:

  • Power management must be configured for the hosts running the highly available virtual machines.
  • The host running the highly available virtual machine must be part of a cluster which has other available hosts.
  • The destination host must be running.
  • The source and destination hosts must have access to the data domain on which the virtual machine resides.
  • The source and destination hosts must have access to the same virtual networks and VLANs.
  • There must be enough CPUs on the destination host that are not in use to support the virtual machine’s requirements.
  • There must be enough RAM on the destination host that is not in use to support the virtual machine’s requirements.
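
The destination-side requirements in this list can be expressed as a single eligibility check. The following Python sketch assumes a simplified data model: the Host and VmRequirements classes, their field names, and the can_restart_on function are hypothetical and exist only for illustration. Power management and cluster membership are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    running: bool
    data_domains: set = field(default_factory=set)  # reachable data domains
    networks: set = field(default_factory=set)      # virtual networks and VLANs
    free_cpus: int = 0
    free_ram_mb: int = 0


@dataclass
class VmRequirements:
    data_domain: str  # data domain on which the virtual machine resides
    networks: set
    cpus: int
    ram_mb: int


def can_restart_on(host: Host, vm: VmRequirements) -> bool:
    """Check the destination host against each requirement in turn."""
    return (
        host.running
        and vm.data_domain in host.data_domains
        and vm.networks <= host.networks  # every required network is present
        and host.free_cpus >= vm.cpus
        and host.free_ram_mb >= vm.ram_mb
    )
```

For example, a running host with access to the right data domain and networks is still rejected in this sketch if it has fewer free CPUs than the virtual machine requires.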

6.16.3. Configuring a Highly Available Virtual Machine

High availability must be configured individually for each virtual machine.

Configuring a Highly Available Virtual Machine

  1. Click Compute → Virtual Machines and select a virtual machine.
  2. Click Edit.
  3. Click the High Availability tab.
  4. Select the Highly Available check box to enable high availability for the virtual machine.
  5. From the Target Storage Domain for VM Lease drop-down list, select the storage domain to hold the virtual machine lease, or select No VM Lease to disable the functionality. See Section 6.16.1, “What is High Availability?” for more information about virtual machine leases.

    Important

    This functionality is only available on storage domains that are V4 or later.

  6. Select AUTO_RESUME, LEAVE_PAUSED, or KILL from the Resume Behavior drop-down list. If you defined a virtual machine lease, KILL is the only option available. For more information, see Section A.1.6, “Virtual Machine High Availability Settings Explained”.
  7. Select Low, Medium, or High from the Priority drop-down list. When migration is triggered, a queue is created in which the high priority virtual machines are migrated first. If a cluster is running low on resources, only the high priority virtual machines are migrated.
  8. Click OK.
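
Two rules from the procedure above lend themselves to a short sketch: a virtual machine lease restricts the resume behavior to KILL, and migration proceeds in priority order, with only high priority virtual machines migrated when the cluster is low on resources. The function names, the tuple layout, and the ranking table below are assumptions made for illustration and do not describe the Manager's internal implementation.

```python
PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}  # lower rank migrates first


def validate_ha_settings(lease_domain, resume_behavior):
    """Raise ValueError if a lease is combined with a behavior other than KILL.

    lease_domain is the name of the storage domain holding the lease,
    or None when No VM Lease is selected.
    """
    if lease_domain is not None and resume_behavior != "KILL":
        raise ValueError("With a VM lease, KILL is the only resume behavior")


def migration_order(vms, low_on_resources=False):
    """Return VM names in the order they would be migrated.

    vms is a list of (name, priority) tuples. When the cluster is low on
    resources, only high priority virtual machines are kept.
    """
    candidates = [vm for vm in vms if not low_on_resources or vm[1] == "High"]
    # sorted() is stable, so VMs with equal priority keep their original order.
    ordered = sorted(candidates, key=lambda vm: PRIORITY_RANK[vm[1]])
    return [name for name, _priority in ordered]


vms = [("db", "High"), ("test", "Low"), ("web", "Medium"), ("mail", "High")]
normal_order = migration_order(vms)
constrained_order = migration_order(vms, low_on_resources=True)
```

With the sample list above, the high priority virtual machines come first in the normal case, and they are the only ones returned when resources are constrained.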