Chapter 3. Configuring backup and recovery options

This chapter explains how to add disaster recovery capabilities to your Red Hat Hyperconverged Infrastructure for Virtualization deployment so that you can restore your cluster to a working state after a disk or server failure.

3.1. Prerequisites

3.1.1. Prerequisites for geo-replication

Be aware of the following requirements and limitations when configuring geo-replication:

One geo-replicated volume only
Red Hat Hyperconverged Infrastructure for Virtualization (RHHI for Virtualization) supports only one geo-replicated volume. Red Hat recommends backing up the volume that stores the data of your virtual machines, as this usually contains the most valuable data.
Two different managers required
The source and destination volumes for geo-replication must be managed by different instances of Red Hat Virtualization Manager.

3.1.2. Prerequisites for failover and failback configuration

Versions must match between environments
Ensure that the primary and secondary environments have the same version of Red Hat Virtualization Manager, with identical data center compatibility versions, cluster compatibility versions, and PostgreSQL versions.
No virtual machine disks in the hosted engine storage domain
The storage domain used by the hosted engine virtual machine is not failed over, so any virtual machine disks in this storage domain will be lost.
Execute Ansible playbooks manually from a separate master node
Generate and execute Ansible playbooks manually from a separate machine that acts as an Ansible master node.

3.2. Supported backup and recovery configurations

There are two supported ways to add disaster recovery capabilities to your Red Hat Hyperconverged Infrastructure for Virtualization deployment.

Configure backing up to a secondary volume only

Regularly synchronizing your data to a remote secondary volume helps to ensure that your data is not lost in the event of disk or server failure.

This option is suitable if the following statements are true of your deployment.

  • You require only a backup of your data for disaster recovery.
  • You do not require highly available storage.
  • You do not want to maintain a secondary cluster.
  • You are willing to manually restore your data and reconfigure your backup solution after a failure has occurred.

Follow the instructions in Configuring backup to a secondary volume to configure this option.

Configure failing over to and failing back from a secondary cluster

This option provides failover and failback capabilities in addition to backing up data on a remote volume. Configuring failover of your primary cluster’s operations and storage domains to a secondary cluster helps to ensure that your data remains available in the event of disk or server failure in the primary cluster.

This option is suitable if the following statements are true of your deployment.

  • You require highly available storage.
  • You are willing to maintain a secondary cluster.
  • You do not want to manually restore your data or reconfigure your backup solution after a failure has occurred.

Follow the instructions in Configuring failover to and failback from a secondary cluster to configure this option.

Red Hat recommends that you configure at least a backup volume for production deployments.

3.3. Configuring backup to a secondary volume

This section covers how to back up a gluster volume to a secondary gluster volume using geo-replication.

To do this, you must:

  1. Ensure that all prerequisites are met.
  2. Create a suitable volume to use as a geo-replication target.
  3. Configure a geo-replication session between the source volume and the target volume.
  4. Schedule the geo-replication process.

3.3.1. Prerequisites

3.3.1.1. Enable shared storage on the source volume

Ensure that the volume you want to back up (the source volume) has shared storage enabled. Run the following command on any server that hosts the source volume to enable shared storage.

# gluster volume set all cluster.enable-shared-storage enable

Ensure that a gluster volume named gluster_shared_storage is created in the source cluster, and is mounted at /var/run/gluster/shared_storage on all the nodes in the source cluster. See Setting Up Shared Storage for further information.
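
To confirm that shared storage is in place, you can check for the volume and its mount point on each node in the source cluster. This is a quick sanity check; the exact mount output varies between versions.

# gluster volume list | grep gluster_shared_storage
# grep shared_storage /proc/mounts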

3.3.1.2. Match encryption on source and target volumes

If encryption is enabled on the volume that you want to back up, encryption must also be enabled on the volume that will hold your backed up data.

See Configure Encryption with Transport Layer Security (TLS/SSL) for details.
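
If you are not sure whether encryption matches, you can query the TLS options on both the source and target volumes and compare the results. The commands below use a placeholder volume name.

# gluster volume get <volname> client.ssl
# gluster volume get <volname> server.ssl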

3.3.2. Create a suitable target volume for geo-replication

Prepare a secondary gluster volume to hold the geo-replicated copy of your source volume. This target volume should be in a separate cluster, hosted at a separate site, so that the risk of source and target volumes being affected by the same outages is minimized.

Ensure that the target volume for geo-replication has sharding enabled. Run the following command on any node that hosts the target volume to enable sharding on that volume.

# gluster volume set <volname> features.shard enable
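
You can confirm the setting by querying the option on the target volume; the value should be reported as enabled (the exact wording varies by version).

# gluster volume get <volname> features.shard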

3.3.3. Configuring geo-replication for backing up volumes

3.3.3.1. Creating a geo-replication session

A geo-replication session is required to replicate data from an active source volume to a passive target volume.

Important

Only rsync-based geo-replication is supported with Red Hat Hyperconverged Infrastructure for Virtualization.

  1. Create a common pem pub file.

    Run the following command on a source node that has passwordless, key-based SSH authentication configured to the target nodes.

    # gluster system:: execute gsec_create
  2. Create the geo-replication session

    Run the following command to create a geo-replication session between the source and target volumes, using the created pem pub file for authentication.

    # gluster volume geo-replication <SOURCE_VOL> <TARGET_NODE>::<TARGET_VOL> create push-pem

    For example, the following command creates a geo-replication session from a source volume prodvol to a target volume called backupvol, which is hosted by backup.example.com.

    # gluster volume geo-replication prodvol backup.example.com::backupvol create push-pem

    By default this command verifies that the target volume is a valid target with available space. You can append the force option to the command to ignore failed verification.

  3. Configure a meta-volume

    This relies on the source volume having shared storage configured, as described in Prerequisites.

    # gluster volume geo-replication <SOURCE_VOL> <TARGET_HOST>::<TARGET_VOL> config use_meta_volume true

Important

Do not start the geo-replication session. Starting the geo-replication session begins replication from your source volume to your target volume.

3.3.3.2. Verifying creation of a geo-replication session

  1. Log in to the Administration Portal on any source node.
  2. Click Storage → Volumes.
  3. Check the Info column for the geo-replication icon.

    If this icon is present, geo-replication has been configured for that volume.

    If this icon is not present, try synchronizing the volume, or check the session from the command line as described after this procedure.
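
You can also list geo-replication sessions and their state from the command line on any source node. This is a standard gluster check; substitute your own volume and target names.

# gluster volume geo-replication <SOURCE_VOL> <TARGET_NODE>::<TARGET_VOL> status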

3.3.3.3. Synchronizing volume state using the Administration Portal

  1. Log in to the Administration Portal.
  2. Click Compute → Volumes.
  3. Select the volume that you want to synchronize.
  4. Click the Geo-replication sub-tab.
  5. Click Sync.

3.3.4. Scheduling regular backups using geo-replication

  1. Log in to the Administration Portal on any source node.
  2. Click Storage → Domains.
  3. Click the name of the storage domain that you want to back up.
  4. Click the Remote Data Sync Setup subtab.
  5. Click Setup.

    The Setup Remote Data Synchronization window opens.

    1. In the Geo-replicated to field, select the backup target.
    2. In the Recurrence field, select a recurrence interval type.

      Valid values are WEEKLY with at least one weekday checkbox selected, or DAILY.

    3. In the Hours and Minutes field, specify the time to start synchronizing.

      Note

      This time is based on the Hosted Engine’s timezone.

    4. Click OK.
  6. Check the Events subtab for the source volume at the time you specified to verify that synchronization works correctly.

3.4. Configuring failover to and failback from a secondary cluster

This section covers how to configure your cluster to fail over to a remote secondary cluster in the event of server failure.

To do this, you must:

  1. Create a secondary cluster for failover.
  2. Create a mapping file between source and target clusters.
  3. Create a failover playbook between source and target clusters.
  4. Create a failover cleanup playbook for your primary cluster.
  5. Create a failback playbook between source and target clusters.

3.4.1. Creating a secondary cluster for failover

Install and configure a secondary cluster that can be used in place of the primary cluster in the event of failure.

This secondary cluster can be either of the following configurations:

Red Hat Hyperconverged Infrastructure
See Deploying Red Hat Hyperconverged Infrastructure for details.
Red Hat Gluster Storage configured for use as a Red Hat Virtualization storage domain
See Configuring Red Hat Virtualization with Red Hat Gluster Storage for details. Note that creating a storage domain is not necessary for this use case; the storage domain is imported as part of the failover process.

The storage on the secondary cluster must not be attached to a data center, so that it can be added to the secondary site’s data center during the failover process.

3.4.2. Creating a mapping file between source and target clusters

Follow this section to create a file that maps the storage in your source cluster to the storage in your target cluster.

Red Hat recommends that you create this file immediately after you first deploy your storage, and keep it up to date as your deployment changes. This helps to ensure that everything in your cluster fails over safely in the event of disaster.

  1. Create a playbook to generate the mapping file.

    Create a playbook that passes information about your cluster to the oVirt.disaster-recovery role, using the site, username, password, and ca variables.

    Red Hat recommends creating this file in the /usr/share/ansible/roles/oVirt.disaster-recovery directory of the Ansible master node that manages failover and failback.

    Example playbook file: dr-ovirt-setup.yml

    ---
    - name: Collect mapping variables
      hosts: localhost
      connection: local
    
      vars:
        site: https://example.engine.redhat.com/ovirt-engine/api
        username: admin@internal
        password: my_password
        ca: /etc/pki/ovirt-engine/ca.pem
        var_file: disaster_recovery_vars.yml
    
      roles:
        - oVirt.disaster-recovery

  2. Generate the mapping file by running the playbook with the generate_mapping tag.

    # ansible-playbook dr-ovirt-setup.yml --tags "generate_mapping"

    This creates the mapping file, disaster_recovery_vars.yml.

  3. Edit disaster_recovery_vars.yml and add information about the secondary cluster, as in the illustrative excerpt after this procedure.

    See Appendix A: Mapping File Attributes in the Red Hat Virtualization Disaster Recovery Guide for detailed information about attributes used in the mapping file.
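
The following excerpt illustrates the general shape of a completed mapping file. It is a sketch only: the attribute names shown here are indicative, and the values are placeholders for your own environment. Consult Appendix A of the Red Hat Virtualization Disaster Recovery Guide for the authoritative list of attributes.

---
# Illustrative mapping file excerpt; attribute names and values are placeholders.
dr_sites_primary_url: https://primary.example.com/ovirt-engine/api
dr_sites_primary_username: admin@internal
dr_sites_primary_ca_file: /etc/pki/ovirt-engine/ca.pem

dr_sites_secondary_url: https://secondary.example.com/ovirt-engine/api
dr_sites_secondary_username: admin@internal
dr_sites_secondary_ca_file: /etc/pki/ovirt-engine/ca.pem

dr_import_storages:
- dr_domain_type: glusterfs
  dr_primary_name: data
  dr_primary_dc_name: Default
  dr_primary_address: host1.example.com
  dr_primary_path: /data
  dr_secondary_name: data
  dr_secondary_dc_name: Default
  dr_secondary_address: backup1.example.com
  dr_secondary_path: /data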

3.4.3. Creating a failover playbook between source and target clusters

Create a playbook file that passes the failover source and target sites to the oVirt.disaster-recovery role, using the dr_source_map and dr_target_host variables. In the example below, these are primary and secondary.

Red Hat recommends creating this file in the /usr/share/ansible/roles/oVirt.disaster-recovery directory of the Ansible master node that manages failover and failback.

Example playbook file: dr-rhv-failover.yml

---
- name: Failover RHV
  hosts: localhost
  connection: local
  vars:
    dr_target_host: secondary
    dr_source_map: primary
  vars_files:
    - disaster_recovery_vars.yml
    - passwords.yml
  roles:
    - oVirt.disaster-recovery
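
The failover playbook also reads a passwords.yml file, which holds the administrator passwords for the primary and secondary sites. A minimal illustrative example follows; the variable names are assumptions based on the Red Hat Virtualization Disaster Recovery Guide, and you can encrypt this file with ansible-vault to avoid storing passwords in plain text.

Example file: passwords.yml

---
# Illustrative only; encrypt this file with ansible-vault in production.
dr_sites_primary_password: <primary_admin_password>
dr_sites_secondary_password: <secondary_admin_password>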

For information about executing failover, see Failing over to a secondary cluster.
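
For reference, failover is typically started by running the playbook with the fail_over tag, as in the following sketch; use the full procedure in that section when you perform an actual failover.

# ansible-playbook dr-rhv-failover.yml --tags "fail_over"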

3.4.4. Creating a failover cleanup playbook for your primary cluster

Create a playbook file that cleans up your primary cluster so that you can use it as a failback target.

Red Hat recommends creating this file in the /usr/share/ansible/roles/oVirt.disaster-recovery directory of the Ansible master node that manages failover and failback.

Example playbook file: dr-cleanup.yml

---
- name: Clean RHV
  hosts: localhost
  connection: local
  vars:
    dr_source_map: primary
  vars_files:
    - disaster_recovery_vars.yml
  roles:
    - oVirt.disaster-recovery

For information about executing failback, see Failing back to a primary cluster.
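
For reference, the cleanup playbook is typically run with the clean_engine tag before data is synchronized back to the primary site, as in the following sketch; use the full procedure in that section when you perform an actual failback.

# ansible-playbook dr-cleanup.yml --tags "clean_engine"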

3.4.5. Creating a failback playbook between source and target clusters

Create a playbook file that passes the failback source and target sites to the oVirt.disaster-recovery role, using the dr_source_map and dr_target_host variables. In the example below, these are secondary and primary.

Red Hat recommends creating this file in the /usr/share/ansible/roles/oVirt.disaster-recovery directory of the Ansible master node that manages failover and failback.

Example playbook file: dr-rhv-failback.yml

---
- name: Failback RHV
  hosts: localhost
  connection: local
  vars:
    dr_target_host: primary
    dr_source_map: secondary
  vars_files:
    - disaster_recovery_vars.yml
    - passwords.yml
  roles:
    - oVirt.disaster-recovery

For information about executing failback, see Failing back to a primary cluster.
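
For reference, failback is typically started by running the playbook with the fail_back tag, as in the following sketch; use the full procedure in that section when you perform an actual failback.

# ansible-playbook dr-rhv-failback.yml --tags "fail_back"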