Chapter 10. Configuring a high-availability cluster by using the ha_cluster RHEL system role

With the ha_cluster system role, you can configure and manage a high-availability cluster that uses the Pacemaker high availability cluster resource manager.

10.1. Variables of the ha_cluster system role

In an ha_cluster system role playbook, you define the variables for a high availability cluster according to the requirements of your cluster deployment.

The variables you can set for an ha_cluster system role are as follows:

ha_cluster_enable_repos
A boolean flag that enables the repositories containing the packages that are needed by the ha_cluster system role. When this variable is set to true, the default value, you must have active subscription coverage for RHEL and the RHEL High Availability Add-On on the systems that you will use as your cluster members or the system role will fail.
ha_cluster_manage_firewall

(RHEL 9.2 and later) A boolean flag that determines whether the ha_cluster system role manages the firewall. When ha_cluster_manage_firewall is set to true, the firewall high availability service and the fence-virt port are enabled. When ha_cluster_manage_firewall is set to false, the ha_cluster system role does not manage the firewall. If your system is running the firewalld service, you must set the parameter to true in your playbook.

You can use the ha_cluster_manage_firewall parameter to add ports, but you cannot use the parameter to remove ports. To remove ports, use the firewall system role directly.

As of RHEL 9.2, the firewall is no longer configured by default, because it is configured only when ha_cluster_manage_firewall is set to true.

ha_cluster_manage_selinux

(RHEL 9.2 and later) A boolean flag that determines whether the ha_cluster system role manages the ports belonging to the firewall high availability service using the selinux system role. When ha_cluster_manage_selinux is set to true, the ports belonging to the firewall high availability service are associated with the SELinux port type cluster_port_t. When ha_cluster_manage_selinux is set to false, the ha_cluster system role does not manage SELinux.

If your system is running the selinux service, you must set this parameter to true in your playbook. Firewall configuration is a prerequisite for managing SELinux. If the firewall is not installed, the managing SELinux policy is skipped.

You can use the ha_cluster_manage_selinux parameter to add policy, but you cannot use the parameter to remove policy. To remove policy, use the selinux system role directly.

ha_cluster_cluster_present

A boolean flag which, if set to true, determines that HA cluster will be configured on the hosts according to the variables passed to the role. Any cluster configuration not specified in the role and not supported by the role will be lost.

If ha_cluster_cluster_present is set to false, all HA cluster configuration will be removed from the target hosts.

The default value of this variable is true.

The following example playbook removes all cluster configuration on node1 and node2

- hosts: node1 node2
  vars:
    ha_cluster_cluster_present: false

  roles:
    - rhel-system-roles.ha_cluster
ha_cluster_start_on_boot
A boolean flag that determines whether cluster services will be configured to start on boot. The default value of this variable is true.
ha_cluster_fence_agent_packages
List of fence agent packages to install. The default value of this variable is fence-agents-all, fence-virt.
ha_cluster_extra_packages

List of additional packages to be installed. The default value of this variable is no packages.

This variable can be used to install additional packages not installed automatically by the role, for example custom resource agents.

It is possible to specify fence agents as members of this list. However, ha_cluster_fence_agent_packages is the recommended role variable to use for specifying fence agents, so that its default value is overridden.

ha_cluster_hacluster_password
A string value that specifies the password of the hacluster user. The hacluster user has full access to a cluster. To protect sensitive data, vault encrypt the password, as described in Encrypting content with Ansible Vault. There is no default password value, and this variable must be specified.
ha_cluster_hacluster_qdevice_password
(RHEL 9.3 and later) A string value that specifies the password of the hacluster user for a quorum device. This parameter is needed only if the ha_cluster_quorum parameter is configured to use a quorum device of type net and the password of the hacluster user on the quorum device is different from the password of the hacluster user specified with the ha_cluster_hacluster_password parameter. The hacluster user has full access to a cluster. To protect sensitive data, vault encrypt the password, as described in Encrypting content with Ansible Vault. There is no default value for this password.
ha_cluster_corosync_key_src

The path to Corosync authkey file, which is the authentication and encryption key for Corosync communication. It is highly recommended that you have a unique authkey value for each cluster. The key should be 256 bytes of random data.

If you specify a key for this variable, it is recommended that you vault encrypt the key, as described in Encrypting content with Ansible Vault.

If no key is specified, a key already present on the nodes will be used. If nodes do not have the same key, a key from one node will be distributed to other nodes so that all nodes have the same key. If no node has a key, a new key will be generated and distributed to the nodes.

If this variable is set, ha_cluster_regenerate_keys is ignored for this key.

The default value of this variable is null.

ha_cluster_pacemaker_key_src

The path to the Pacemaker authkey file, which is the authentication and encryption key for Pacemaker communication. It is highly recommended that you have a unique authkey value for each cluster. The key should be 256 bytes of random data.

If you specify a key for this variable, it is recommended that you vault encrypt the key, as described in Encrypting content with Ansible Vault.

If no key is specified, a key already present on the nodes will be used. If nodes do not have the same key, a key from one node will be distributed to other nodes so that all nodes have the same key. If no node has a key, a new key will be generated and distributed to the nodes.

If this variable is set, ha_cluster_regenerate_keys is ignored for this key.

The default value of this variable is null.

ha_cluster_fence_virt_key_src

The path to the fence-virt or fence-xvm pre-shared key file, which is the location of the authentication key for the fence-virt or fence-xvm fence agent.

If you specify a key for this variable, it is recommended that you vault encrypt the key, as described in Encrypting content with Ansible Vault.

If no key is specified, a key already present on the nodes will be used. If nodes do not have the same key, a key from one node will be distributed to other nodes so that all nodes have the same key. If no node has a key, a new key will be generated and distributed to the nodes. If the ha_cluster system role generates a new key in this fashion, you should copy the key to your nodes' hypervisor to ensure that fencing works.

If this variable is set, ha_cluster_regenerate_keys is ignored for this key.

The default value of this variable is null.

ha_cluster_pcsd_public_key_srcr, ha_cluster_pcsd_private_key_src

The path to the pcsd TLS certificate and private key. If this is not specified, a certificate-key pair already present on the nodes will be used. If a certificate-key pair is not present, a random new one will be generated.

If you specify a private key value for this variable, it is recommended that you vault encrypt the key, as described in Encrypting content with Ansible Vault.

If these variables are set, ha_cluster_regenerate_keys is ignored for this certificate-key pair.

The default value of these variables is null.

ha_cluster_pcsd_certificates

(RHEL 9.2 and later) Creates a pcsd private key and certificate using the certificate system role.

If your system is not configured with a pcsd private key and certificate, you can create them in one of two ways:

  • Set the ha_cluster_pcsd_certificates variable. When you set the ha_cluster_pcsd_certificates variable, the certificate system role is used internally and it creates the private key and certificate for pcsd as defined.
  • Do not set the ha_cluster_pcsd_public_key_src, ha_cluster_pcsd_private_key_src, or the ha_cluster_pcsd_certificates variables. If you do not set any of these variables, the ha_cluster system role will create pcsd certificates by means of pcsd itself. The value of ha_cluster_pcsd_certificates is set to the value of the variable certificate_requests as specified in the certificate system role. For more information about the certificate system role, see Requesting certificates using RHEL system roles.

The following operational considerations apply to the use of the ha_cluster_pcsd_certificate variable:

  • Unless you are using IPA and joining the systems to an IPA domain, the certificate system role creates self-signed certificates. In this case, you must explicitly configure trust settings outside of the context of RHEL system roles. System roles do not support configuring trust settings.
  • When you set the ha_cluster_pcsd_certificates variable, do not set the ha_cluster_pcsd_public_key_src and ha_cluster_pcsd_private_key_src variables.
  • When you set the ha_cluster_pcsd_certificates variable, ha_cluster_regenerate_keys is ignored for this certificate - key pair.

The default value of this variable is [].

For an example ha_cluster system role playbook that creates TLS certificates and key files in a high availability cluster, see Creating pcsd TLS certificates and key files for a high availability cluster.

ha_cluster_regenerate_keys
A boolean flag which, when set to true, determines that pre-shared keys and TLS certificates will be regenerated. For more information about when keys and certificates will be regenerated, see the descriptions of the ha_cluster_corosync_key_src, ha_cluster_pacemaker_key_src, ha_cluster_fence_virt_key_src, ha_cluster_pcsd_public_key_src, and ha_cluster_pcsd_private_key_src variables.
The default value of this variable is false.
ha_cluster_pcs_permission_list

Configures permissions to manage a cluster using pcsd. The items you configure with this variable are as follows:

  • type - user or group
  • name - user or group name
  • allow_list - Allowed actions for the specified user or group:

    • read - View cluster status and settings
    • write - Modify cluster settings except permissions and ACLs
    • grant - Modify cluster permissions and ACLs
    • full - Unrestricted access to a cluster including adding and removing nodes and access to keys and certificates

The structure of the ha_cluster_pcs_permission_list variable and its default values are as follows:

ha_cluster_pcs_permission_list:
  - type: group
    name: hacluster
    allow_list:
      - grant
      - read
      - write
ha_cluster_cluster_name
The name of the cluster. This is a string value with a default of my-cluster.
ha_cluster_transport

(RHEL 9.1 and later) Sets the cluster transport method. The items you configure with this variable are as follows:

  • type (optional) - Transport type: knet, udp, or udpu. The udp and udpu transport types support only one link. Encryption is always disabled for udp and udpu. Defaults to knet if not specified.
  • options (optional) - List of name-value dictionaries with transport options.
  • links (optional) - List of list of name-value dictionaries. Each list of name-value dictionaries holds options for one Corosync link. It is recommended that you set the linknumber value for each link. Otherwise, the first list of dictionaries is assigned by default to the first link, the second one to the second link, and so on.
  • compression (optional) - List of name-value dictionaries configuring transport compression. Supported only with the knet transport type.
  • crypto (optional) - List of name-value dictionaries configuring transport encryption. By default, encryption is enabled. Supported only with the knet transport type.

For a list of allowed options, see the pcs -h cluster setup help page or the setup description in the cluster section of the pcs(8) man page. For more detailed descriptions, see the corosync.conf(5) man page.

The structure of the ha_cluster_transport variable is as follows:

ha_cluster_transport:
  type: knet
  options:
    - name: option1_name
      value: option1_value
    - name: option2_name
      value: option2_value
  links:
    -
      - name: option1_name
        value: option1_value
      - name: option2_name
        value: option2_value
    -
      - name: option1_name
        value: option1_value
      - name: option2_name
        value: option2_value
  compression:
    - name: option1_name
      value: option1_value
    - name: option2_name
      value: option2_value
  crypto:
    - name: option1_name
      value: option1_value
    - name: option2_name
      value: option2_value

For an example ha_cluster system role playbook that configures a transport method, see Configuring Corosync values in a high availability cluster.

ha_cluster_totem

(RHEL 9.1 and later) Configures Corosync totem. For a list of allowed options, see the pcs -h cluster setup help page or the setup description in the cluster section of the pcs(8) man page. For a more detailed description, see the corosync.conf(5) man page.

The structure of the ha_cluster_totem variable is as follows:

ha_cluster_totem:
  options:
    - name: option1_name
      value: option1_value
    - name: option2_name
      value: option2_value

For an example ha_cluster system role playbook that configures a Corosync totem, see Configuring Corosync values in a high availability cluster.

ha_cluster_quorum

(RHEL 9.1 and later) Configures cluster quorum. You can configure the following items for cluster quorum:

  • options (optional) - List of name-value dictionaries configuring quorum. Allowed options are: auto_tie_breaker, last_man_standing, last_man_standing_window, and wait_for_all. For information about quorum options, see the votequorum(5) man page.
  • device (optional) - (RHEL 9.2 and later) Configures the cluster to use a quorum device. By default, no quorum device is used.

    • model (mandatory) - Specifies a quorum device model. Only net is supported
    • model_options (optional) - List of name-value dictionaries configuring the specified quorum device model. For model net, you must specify host and algorithm options.

      Use the pcs-address option to set a custom pcsd address and port to connect to the qnetd host. If you do not specify this option, the role connects to the default pcsd port on the host.

    • generic_options (optional) - List of name-value dictionaries setting quorum device options that are not model specific.
    • heuristics_options (optional) - List of name-value dictionaries configuring quorum device heuristics.

      For information about quorum device options, see the corosync-qdevice(8) man page. The generic options are sync_timeout and timeout. For model net options see the quorum.device.net section. For heuristics options, see the quorum.device.heuristics section.

      To regenerate a quorum device TLS certificate, set the ha_cluster_regenerate_keys variable to true.

The structure of the ha_cluster_quorum variable is as follows:

ha_cluster_quorum:
  options:
    - name: option1_name
      value: option1_value
    - name: option2_name
      value: option2_value
  device:
    model: string
    model_options:
      - name: option1_name
        value: option1_value
      - name: option2_name
        value: option2_value
    generic_options:
      - name: option1_name
        value: option1_value
      - name: option2_name
        value: option2_value
    heuristics_options:
      - name: option1_name
        value: option1_value
      - name: option2_name
        value: option2_value

For an example ha_cluster system role playbook that configures cluster quorum, see Configuring Corosync values in a high availability cluster. For an example ha_cluster system role playbook that configures a cluster using a quorum device, see Configuring a high availability cluster using a quorum device.

ha_cluster_sbd_enabled

(RHEL 9.1 and later) A boolean flag which determines whether the cluster can use the SBD node fencing mechanism. The default value of this variable is false.

For an example ha_cluster system role playbook that enables SBD, see Configuring a high availability cluster with SBD node fencing.

ha_cluster_sbd_options

(RHEL 9.1 and later) List of name-value dictionaries specifying SBD options. Supported options are:

  • delay-start - defaults to no
  • startmode - defaults to always
  • timeout-action - defaults to flush,reboot
  • watchdog-timeout - defaults to 5

    For information about these options, see the Configuration via environment section of the sbd(8) man page.

For an example ha_cluster system role playbook that configures SBD options, see Configuring a high availability cluster with SBD node fencing.

When using SBD, you can optionally configure watchdog and SBD devices for each node in an inventory. For information about configuring watchdog and SBD devices in an inventory file, see Specifying an inventory for the ha_cluster system role.

ha_cluster_cluster_properties

List of sets of cluster properties for Pacemaker cluster-wide configuration. Only one set of cluster properties is supported.

The structure of a set of cluster properties is as follows:

ha_cluster_cluster_properties:
  - attrs:
      - name: property1_name
        value: property1_value
      - name: property2_name
        value: property2_value

By default, no properties are set.

The following example playbook configures a cluster consisting of node1 and node2 and sets the stonith-enabled and no-quorum-policy cluster properties.

- hosts: node1 node2
  vars:
    ha_cluster_cluster_name: my-new-cluster
    ha_cluster_hacluster_password: password
    ha_cluster_cluster_properties:
      - attrs:
          - name: stonith-enabled
            value: 'true'
          - name: no-quorum-policy
            value: stop

  roles:
    - rhel-system-roles.ha_cluster
ha_cluster_resource_primitives

This variable defines pacemaker resources configured by the system role, including stonith resources, including stonith resources. You can configure the following items for each resource:

  • id (mandatory) - ID of a resource.
  • agent (mandatory) - Name of a resource or stonith agent, for example ocf:pacemaker:Dummy or stonith:fence_xvm. It is mandatory to specify stonith: for stonith agents. For resource agents, it is possible to use a short name, such as Dummy, instead of ocf:pacemaker:Dummy. However, if several agents with the same short name are installed, the role will fail as it will be unable to decide which agent should be used. Therefore, it is recommended that you use full names when specifying a resource agent.
  • instance_attrs (optional) - List of sets of the resource’s instance attributes. Currently, only one set is supported. The exact names and values of attributes, as well as whether they are mandatory or not, depend on the resource or stonith agent.
  • meta_attrs (optional) - List of sets of the resource’s meta attributes. Currently, only one set is supported.
  • copy_operations_from_agent (optional) - (RHEL 9.3 and later) Resource agents usually define default settings for resource operations, such as interval and timeout, optimized for the specific agent. If this variable is set to true, then those settings are copied to the resource configuration. Otherwise, clusterwide defaults apply to the resource. If you also define resource operation defaults for the resource with the ha_cluster_resource_operation_defaults role variable, you can set this to false. The default value of this variable is true.
  • operations (optional) - List of the resource’s operations.

    • action (mandatory) - Operation action as defined by pacemaker and the resource or stonith agent.
    • attrs (mandatory) - Operation options, at least one option must be specified.

The structure of the resource definition that you configure with the ha_cluster system role is as follows:

  - id: resource-id
    agent: resource-agent
    instance_attrs:
      - attrs:
          - name: attribute1_name
            value: attribute1_value
          - name: attribute2_name
            value: attribute2_value
    meta_attrs:
      - attrs:
          - name: meta_attribute1_name
            value: meta_attribute1_value
          - name: meta_attribute2_name
            value: meta_attribute2_value
    copy_operations_from_agent: bool
    operations:
      - action: operation1-action
        attrs:
          - name: operation1_attribute1_name
            value: operation1_attribute1_value
          - name: operation1_attribute2_name
            value: operation1_attribute2_value
      - action: operation2-action
        attrs:
          - name: operation2_attribute1_name
            value: operation2_attribute1_value
          - name: operation2_attribute2_name
            value: operation2_attribute2_value

By default, no resources are defined.

For an example ha_cluster system role playbook that includes resource configuration, see Configuring a high availability cluster with fencing and resources.

ha_cluster_resource_groups

This variable defines pacemaker resource groups configured by the system role. You can configure the following items for each resource group:

  • id (mandatory) - ID of a group.
  • resources (mandatory) - List of the group’s resources. Each resource is referenced by its ID and the resources must be defined in the ha_cluster_resource_primitives variable. At least one resource must be listed.
  • meta_attrs (optional) - List of sets of the group’s meta attributes. Currently, only one set is supported.

The structure of the resource group definition that you configure with the ha_cluster system role is as follows:

ha_cluster_resource_groups:
  - id: group-id
    resource_ids:
      - resource1-id
      - resource2-id
    meta_attrs:
      - attrs:
          - name: group_meta_attribute1_name
            value: group_meta_attribute1_value
          - name: group_meta_attribute2_name
            value: group_meta_attribute2_value

By default, no resource groups are defined.

For an example ha_cluster system role playbook that includes resource group configuration, see Configuring a high availability cluster with fencing and resources.

ha_cluster_resource_clones

This variable defines pacemaker resource clones configured by the system role. You can configure the following items for a resource clone:

  • resource_id (mandatory) - Resource to be cloned. The resource must be defined in the ha_cluster_resource_primitives variable or the ha_cluster_resource_groups variable.
  • promotable (optional) - Indicates whether the resource clone to be created is a promotable clone, indicated as true or false.
  • id (optional) - Custom ID of the clone. If no ID is specified, it will be generated. A warning will be displayed if this option is not supported by the cluster.
  • meta_attrs (optional) - List of sets of the clone’s meta attributes. Currently, only one set is supported.

The structure of the resource clone definition that you configure with the ha_cluster system role is as follows:

ha_cluster_resource_clones:
  - resource_id: resource-to-be-cloned
    promotable: true
    id: custom-clone-id
    meta_attrs:
      - attrs:
          - name: clone_meta_attribute1_name
            value: clone_meta_attribute1_value
          - name: clone_meta_attribute2_name
            value: clone_meta_attribute2_value

By default, no resource clones are defined.

For an example ha_cluster system role playbook that includes resource clone configuration, see Configuring a high availability cluster with fencing and resources.

ha_cluster_resource_defaults

(RHEL 9.3 and later) This variable defines sets of resource defaults. You can define multiple sets of defaults and apply them to resources of specific agents using rules. The defaults you specify with the ha_cluster_resource_defaults variable do not apply to resources which override them with their own defined values.

Only meta attributes can be specified as defaults.

You can configure the following items for each defaults set:

  • id (optional) - ID of the defaults set. If not specified, it is autogenerated.
  • rule (optional) - Rule written using pcs syntax defining when and for which resources the set applies. For information on specifying a rule, see the resource defaults set create section of the pcs(8) man page.
  • score (optional) - Weight of the defaults set.
  • attrs (optional) - Meta attributes applied to resources as defaults.

The structure of the ha_cluster_resource_defaults variable is as follows:

ha_cluster_resource_defaults:
  meta_attrs:
    - id: defaults-set-1-id
      rule: rule-string
      score: score-value
      attrs:
        - name: meta_attribute1_name
          value: meta_attribute1_value
        - name: meta_attribute2_name
          value: meta_attribute2_value
    - id: defaults-set-2-id
      rule: rule-string
      score: score-value
      attrs:
        - name: meta_attribute3_name
          value: meta_attribute3_value
        - name: meta_attribute4_name
          value: meta_attribute4_value

For an example ha_cluster system role playbook that configures resource defaults, see Configuring a high availability cluster with resource and resource operation defaults.

ha_cluster_resource_operation_defaults

(RHEL 9.3 and later) This variable defines sets of resource operation defaults. You can define multiple sets of defaults and apply them to resources of specific agents and specific resource operations using rules. The defaults you specify with the ha_cluster_resource_operation_defaults variable do not apply to resource operations which override them with their own defined values. By default, the ha_cluster system role configures resources to define their own values for resource operations. For information about overriding these defaults with the ha_cluster_resource_operations_defaults variable, see the description of the copy_operations_from_agent item in ha_cluster_resource_primitives.

Only meta attributes can be specified as defaults.

The structure of the ha_cluster_resource_operations_defaults variable is the same as the structure for the ha_cluster_resource_defaults variable, with the exception of how you specify a rule. For information about specifying a rule to describe the resource operation to which a set applies, see the resource op defaults set create section of the pcs(8) man page.

ha_cluster_constraints_location

This variable defines resource location constraints. Resource location constraints indicate which nodes a resource can run on. You can specify a resources specified by a resource ID or by a pattern, which can match more than one resource. You can specify a node by a node name or by a rule.

You can configure the following items for a resource location constraint:

  • resource (mandatory) - Specification of a resource the constraint applies to.
  • node (mandatory) - Name of a node the resource should prefer or avoid.
  • id (optional) - ID of the constraint. If not specified, it will be autogenerated.
  • options (optional) - List of name-value dictionaries.

    • score - Sets the weight of the constraint.

      • A positive score value means the resource prefers running on the node.
      • A negative score value means the resource should avoid running on the node.
      • A score value of -INFINITY means the resource must avoid running on the node.
      • If score is not specified, the score value defaults to INFINITY.

By default no resource location constraints are defined.

The structure of a resource location constraint specifying a resource ID and node name is as follows:

ha_cluster_constraints_location:
  - resource:
      id: resource-id
    node: node-name
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: option-name
        value: option-value

The items that you configure for a resource location constraint that specifies a resource pattern are the same items that you configure for a resource location constraint that specifies a resource ID, with the exception of the resource specification itself. The item that you specify for the resource specification is as follows:

  • pattern (mandatory) - POSIX extended regular expression resource IDs are matched against.

The structure of a resource location constraint specifying a resource pattern and node name is as follows:

ha_cluster_constraints_location:
  - resource:
      pattern: resource-pattern
    node: node-name
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: resource-discovery
        value: resource-discovery-value

You can configure the following items for a resource location constraint that specifies a resource ID and a rule:

  • resource (mandatory) - Specification of a resource the constraint applies to.

    • id (mandatory) - Resource ID.
    • role (optional) - The resource role to which the constraint is limited: Started, Unpromoted, Promoted.
  • rule (mandatory) - Constraint rule written using pcs syntax. For further information, see the constraint location section of the pcs(8) man page.
  • Other items to specify have the same meaning as for a resource constraint that does not specify a rule.

The structure of a resource location constraint that specifies a resource ID and a rule is as follows:

ha_cluster_constraints_location:
  - resource:
      id: resource-id
      role: resource-role
    rule: rule-string
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: resource-discovery
        value: resource-discovery-value

The items that you configure for a resource location constraint that specifies a resource pattern and a rule are the same items that you configure for a resource location constraint that specifies a resource ID and a rule, with the exception of the resource specification itself. The item that you specify for the resource specification is as follows:

  • pattern (mandatory) - POSIX extended regular expression resource IDs are matched against.

The structure of a resource location constraint that specifies a resource pattern and a rule is as follows:

ha_cluster_constraints_location:
  - resource:
      pattern: resource-pattern
      role: resource-role
    rule: rule-string
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: resource-discovery
        value: resource-discovery-value

For an example ha_cluster system role playbook that creates a cluster with resource constraints, see Configuring a high availability cluster with resource constraints.

ha_cluster_constraints_colocation

This variable defines resource colocation constraints. Resource colocation constraints indicate that the location of one resource depends on the location of another one. There are two types of colocation constraints: a simple colocation constraint for two resources, and a set colocation constraint for multiple resources.

You can configure the following items for a simple resource colocation constraint:

  • resource_follower (mandatory) - A resource that should be located relative to resource_leader.

    • id (mandatory) - Resource ID.
    • role (optional) - The resource role to which the constraint is limited: Started, Unpromoted, Promoted.
  • resource_leader (mandatory) - The cluster will decide where to put this resource first and then decide where to put resource_follower.

    • id (mandatory) - Resource ID.
    • role (optional) - The resource role to which the constraint is limited: Started, Unpromoted, Promoted.
  • id (optional) - ID of the constraint. If not specified, it will be autogenerated.
  • options (optional) - List of name-value dictionaries.

    • score - Sets the weight of the constraint.

      • Positive score values indicate the resources should run on the same node.
      • Negative score values indicate the resources should run on different nodes.
      • A score value of +INFINITY indicates the resources must run on the same node.
      • A score value of -INFINITY indicates the resources must run on different nodes.
      • If score is not specified, the score value defaults to INFINITY.

By default no resource colocation constraints are defined.

The structure of a simple resource colocation constraint is as follows:

ha_cluster_constraints_colocation:
  - resource_follower:
      id: resource-id1
      role: resource-role1
    resource_leader:
      id: resource-id2
      role: resource-role2
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: option-name
        value: option-value

You can configure the following items for a resource set colocation constraint:

  • resource_sets (mandatory) - List of resource sets.

    • resource_ids (mandatory) - List of resources in a set.
    • options (optional) - List of name-value dictionaries fine-tuning how resources in the sets are treated by the constraint.
  • id (optional) - Same values as for a simple colocation constraint.
  • options (optional) - Same values as for a simple colocation constraint.

The structure of a resource set colocation constraint is as follows:

ha_cluster_constraints_colocation:
  - resource_sets:
      - resource_ids:
          - resource-id1
          - resource-id2
        options:
          - name: option-name
            value: option-value
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: option-name
        value: option-value

For an example ha_cluster system role playbook that creates a cluster with resource constraints, see Configuring a high availability cluster with resource constraints.

ha_cluster_constraints_order

This variable defines resource order constraints. Resource order constraints indicate the order in which certain resource actions should occur. There are two types of resource order constraints: a simple order constraint for two resources, and a set order constraint for multiple resources.

You can configure the following items for a simple resource order constraint:

  • resource_first (mandatory) - Resource that the resource_then resource depends on.

    • id (mandatory) - Resource ID.
    • action (optional) - The action that must complete before an action can be initiated for the resource_then resource. Allowed values: start, stop, promote, demote.
  • resource_then (mandatory) - The dependent resource.

    • id (mandatory) - Resource ID.
    • action (optional) - The action that the resource can execute only after the action on the resource_first resource has completed. Allowed values: start, stop, promote, demote.
  • id (optional) - ID of the constraint. If not specified, it will be autogenerated.
  • options (optional) - List of name-value dictionaries.

By default no resource order constraints are defined.

The structure of a simple resource order constraint is as follows:

ha_cluster_constraints_order:
  - resource_first:
      id: resource-id1
      action: resource-action1
    resource_then:
      id: resource-id2
      action: resource-action2
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: option-name
        value: option-value

You can configure the following items for a resource set order constraint:

  • resource_sets (mandatory) - List of resource sets.

    • resource_ids (mandatory) - List of resources in a set.
    • options (optional) - List of name-value dictionaries fine-tuning how resources in the sets are treated by the constraint.
  • id (optional) - Same values as for a simple order constraint.
  • options (optional) - Same values as for a simple order constraint.

The structure of a resource set order constraint is as follows:

ha_cluster_constraints_order:
  - resource_sets:
      - resource_ids:
          - resource-id1
          - resource-id2
        options:
          - name: option-name
            value: option-value
    id: constraint-id
    options:
      - name: score
        value: score-value
      - name: option-name
        value: option-value

For an example ha_cluster system role playbook that creates a cluster with resource constraints, see Configuring a high availability cluster with resource constraints.

ha_cluster_constraints_ticket

This variable defines resource ticket constraints. Resource ticket constraints indicate the resources that depend on a certain ticket. There are two types of resource ticket constraints: a simple ticket constraint for one resource, and a ticket order constraint for multiple resources.

You can configure the following items for a simple resource ticket constraint:

  • resource (mandatory) - Specification of a resource the constraint applies to.

    • id (mandatory) - Resource ID.
    • role (optional) - The resource role to which the constraint is limited: Started, Unpromoted, Promoted.
  • ticket (mandatory) - Name of a ticket the resource depends on.
  • id (optional) - ID of the constraint. If not specified, it will be autogenerated.
  • options (optional) - List of name-value dictionaries.

    • loss-policy (optional) - Action to perform on the resource if the ticket is revoked.

By default no resource ticket constraints are defined.

The structure of a simple resource ticket constraint is as follows:

ha_cluster_constraints_ticket:
  - resource:
      id: resource-id
      role: resource-role
    ticket: ticket-name
    id: constraint-id
    options:
      - name: loss-policy
        value: loss-policy-value
      - name: option-name
        value: option-value

You can configure the following items for a resource set ticket constraint:

  • resource_sets (mandatory) - List of resource sets.

    • resource_ids (mandatory) - List of resources in a set.
    • options (optional) - List of name-value dictionaries fine-tuning how resources in the sets are treated by the constraint.
  • ticket (mandatory) - Same value as for a simple ticket constraint.
  • id (optional) - Same value as for a simple ticket constraint.
  • options (optional) - Same values as for a simple ticket constraint.

The structure of a resource set ticket constraint is as follows:

ha_cluster_constraints_ticket:
  - resource_sets:
      - resource_ids:
          - resource-id1
          - resource-id2
        options:
          - name: option-name
            value: option-value
    ticket: ticket-name
    id: constraint-id
    options:
      - name: option-name
        value: option-value

For an example ha_cluster system role playbook that creates a cluster with resource constraints, see Configuring a high availability cluster with resource constraints.

ha_cluster_qnetd

(RHEL 9.1 and later) This variable configures a qnetd host which can then serve as an external quorum device for clusters.

You can configure the following items for a qnetd host:

  • present (optional) - If true, configure a qnetd instance on the host. If false, remove qnetd configuration from the host. The default value is false. If you set this true, you must set ha_cluster_cluster_present to false.
  • start_on_boot (optional) - Configures whether the qnetd instance should start automatically on boot. The default value is true.
  • regenerate_keys (optional) - Set this variable to true to regenerate the qnetd TLS certificate. If you regenerate the certificate, you must either re-run the role for each cluster to connect it to the qnetd host again or run pcs manually.

You cannot run qnetd on a cluster node because fencing would disrupt qnetd operation.

For an example ha_cluster system role playbook that configures a cluster using a quorum device, see Configuring a cluster using a quorum device.

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.2. Specifying an inventory for the ha_cluster system role

When configuring an HA cluster using the ha_cluster system role playbook, you configure the names and addresses of the nodes for the cluster in an inventory.

10.2.1. Configuring node names and addresses in an inventory

For each node in an inventory, you can optionally specify the following items:

  • node_name - the name of a node in a cluster.
  • pcs_address - an address used by pcs to communicate with the node. It can be a name, FQDN or an IP address and it can include a port number.
  • corosync_addresses - list of addresses used by Corosync. All nodes which form a particular cluster must have the same number of addresses and the order of the addresses matters.

The following example shows an inventory with targets node1 and node2. node1 and node2 must be either fully qualified domain names or must otherwise be able to connect to the nodes as when, for example, the names are resolvable through the /etc/hosts file.

all:
  hosts:
    node1:
      ha_cluster:
        node_name: node-A
        pcs_address: node1-address
        corosync_addresses:
          - 192.168.1.11
          - 192.168.2.11
    node2:
      ha_cluster:
        node_name: node-B
        pcs_address: node2-address:2224
        corosync_addresses:
          - 192.168.1.12
          - 192.168.2.12

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.2.2. Configuring watchdog and SBD devices in an inventory

(RHEL 9.1 and later) When using SBD, you can optionally configure watchdog and SBD devices for each node in an inventory. Even though all SBD devices must be shared to and accessible from all nodes, each node can use different names for the devices. Watchdog devices can be different for each node as well. For information about the SBD variables you can set in a system role playbook, see the entries for ha_cluster_sbd_enabled and ha_cluster_sbd_options in Variables of the ha_cluster system role.

For each node in an inventory, you can optionally specify the following items:

  • sbd_watchdog_modules (optional) - (RHEL 9.3 and later) Watchdog kernel modules to be loaded, which create /dev/watchdog* devices. Defaults to empty list if not set.
  • sbd_watchdog_modules_blocklist (optional) - (RHEL 9.3 and later) Watchdog kernel modules to be unloaded and blocked. Defaults to empty list if not set.
  • sbd_watchdog - Watchdog device to be used by SBD. Defaults to /dev/watchdog if not set.
  • sbd_devices - Devices to use for exchanging SBD messages and for monitoring. Defaults to empty list if not set.

The following example shows an inventory that configures watchdog and SBD devices for targets node1 and node2.

all:
  hosts:
    node1:
      ha_cluster:
        sbd_watchdog_modules:
          - module1
          - module2
        sbd_watchdog: /dev/watchdog2
        sbd_devices:
          - /dev/vdx
          - /dev/vdy
    node2:
      ha_cluster:
        sbd_watchdog_modules:
          - module1
        sbd_watchdog_modules_blocklist:
          - module2
        sbd_watchdog: /dev/watchdog1
        sbd_devices:
          - /dev/vdw
          - /dev/vdz

For information about creating a high availability cluster that uses SBD fencing, see Configuring a high availability cluster with SBD node fencing.

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.3. Creating pcsd TLS certificates and key files for a high availability cluster (RHEL 9.2 and later)

You can use the ha_cluster system role to create TLS certificates and key files in a high availability cluster. When you run this playbook, the ha_cluster system role uses the certificate system role internally to manage TLS certificates.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create TLS certificates and key files in a high availability cluster
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_pcsd_certificates:
          - name: FILENAME
            common_name: "{{ ansible_hostname }}"
            ca: self-sign

    This playbook configures a cluster running the firewalld and selinux services and creates a self-signed pcsd certificate and private key files in /var/lib/pcsd. The pcsd certificate has the file name FILENAME.crt and the key file is named FILENAME.key.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

10.4. Configuring a high availability cluster running no resources

The following procedure uses the ha_cluster system role, to create a high availability cluster with no fencing configured and which runs no resources.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster with no fencing and which runs no resources
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true

    This example playbook file configures a cluster running the firewalld and selinux services with no fencing configured and which runs no resources.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.5. Configuring a high availability cluster with fencing and resources

The following procedure uses the ha_cluster system role to create a high availability cluster that includes a fencing device, cluster resources, resource groups, and a cloned resource.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster that includes a fencing device and resources
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_resource_primitives:
          - id: xvm-fencing
            agent: 'stonith:fence_xvm'
            instance_attrs:
              - attrs:
                  - name: pcmk_host_list
                    value: node1 node2
          - id: simple-resource
            agent: 'ocf:pacemaker:Dummy'
          - id: resource-with-options
            agent: 'ocf:pacemaker:Dummy'
            instance_attrs:
              - attrs:
                  - name: fake
                    value: fake-value
                  - name: passwd
                    value: passwd-value
            meta_attrs:
              - attrs:
                  - name: target-role
                    value: Started
                  - name: is-managed
                    value: 'true'
            operations:
              - action: start
                attrs:
                  - name: timeout
                    value: '30s'
              - action: monitor
                attrs:
                  - name: timeout
                    value: '5'
                  - name: interval
                    value: '1min'
          - id: dummy-1
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-2
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-3
            agent: 'ocf:pacemaker:Dummy'
          - id: simple-clone
            agent: 'ocf:pacemaker:Dummy'
          - id: clone-with-options
            agent: 'ocf:pacemaker:Dummy'
        ha_cluster_resource_groups:
          - id: simple-group
            resource_ids:
              - dummy-1
              - dummy-2
            meta_attrs:
              - attrs:
                  - name: target-role
                    value: Started
                  - name: is-managed
                    value: 'true'
          - id: cloned-group
            resource_ids:
              - dummy-3
        ha_cluster_resource_clones:
          - resource_id: simple-clone
          - resource_id: clone-with-options
            promotable: yes
            id: custom-clone-id
            meta_attrs:
              - attrs:
                  - name: clone-max
                    value: '2'
                  - name: clone-node-max
                    value: '1'
          - resource_id: cloned-group
            promotable: yes

    This example playbook file configures a cluster running the firewalld and selinux services. The cluster includes fencing, several resources, and a resource group. It also includes a resource clone for the resource group.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.6. Configuring a high availability cluster with resource and resource operation defaults

(RHEL 9.3 and later) The following procedure uses the ha_cluster system role to create a high availability cluster that defines resource and resource operation defaults.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster that defines resource and resource operation defaults
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        # Set a different resource-stickiness value during
        # and outside work hours. This allows resources to
        # automatically move back to their most
        # preferred hosts, but at a time that
        # does not interfere with business activities.
        ha_cluster_resource_defaults:
          meta_attrs:
            - id: core-hours
              rule: date-spec hours=9-16 weekdays=1-5
              score: 2
              attrs:
                - name: resource-stickiness
                  value: INFINITY
            - id: after-hours
              score: 1
              attrs:
                - name: resource-stickiness
                  value: 0
        # Default the timeout on all 10-second-interval
        # monitor actions on IPaddr2 resources to 8 seconds.
        ha_cluster_resource_operation_defaults:
          meta_attrs:
            - rule: resource ::IPaddr2 and op monitor interval=10s
              score: INFINITY
              attrs:
                - name: timeout
                  value: 8s

    This example playbook file configures a cluster running the firewalld and selinux services. The cluster includes resource and resource operation defaults.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.7. Configuring a high availability cluster with resource constraints

The following procedure uses the ha_cluster system role to create a high availability cluster that includes resource location constraints, resource colocation constraints, resource order constraints, and resource ticket constraints.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster with resource constraints
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        # In order to use constraints, we need resources the constraints will apply
        # to.
        ha_cluster_resource_primitives:
          - id: xvm-fencing
            agent: 'stonith:fence_xvm'
            instance_attrs:
              - attrs:
                  - name: pcmk_host_list
                    value: node1 node2
          - id: dummy-1
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-2
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-3
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-4
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-5
            agent: 'ocf:pacemaker:Dummy'
          - id: dummy-6
            agent: 'ocf:pacemaker:Dummy'
        # location constraints
        ha_cluster_constraints_location:
          # resource ID and node name
          - resource:
              id: dummy-1
            node: node1
            options:
              - name: score
                value: 20
          # resource pattern and node name
          - resource:
              pattern: dummy-\d+
            node: node1
            options:
              - name: score
                value: 10
          # resource ID and rule
          - resource:
              id: dummy-2
            rule: '#uname eq node2 and date in_range 2022-01-01 to 2022-02-28'
          # resource pattern and rule
          - resource:
              pattern: dummy-\d+
            rule: node-type eq weekend and date-spec weekdays=6-7
        # colocation constraints
        ha_cluster_constraints_colocation:
          # simple constraint
          - resource_leader:
              id: dummy-3
            resource_follower:
              id: dummy-4
            options:
              - name: score
                value: -5
          # set constraint
          - resource_sets:
              - resource_ids:
                  - dummy-1
                  - dummy-2
              - resource_ids:
                  - dummy-5
                  - dummy-6
                options:
                  - name: sequential
                    value: "false"
            options:
              - name: score
                value: 20
        # order constraints
        ha_cluster_constraints_order:
          # simple constraint
          - resource_first:
              id: dummy-1
            resource_then:
              id: dummy-6
            options:
              - name: symmetrical
                value: "false"
          # set constraint
          - resource_sets:
              - resource_ids:
                  - dummy-1
                  - dummy-2
                options:
                  - name: require-all
                    value: "false"
                  - name: sequential
                    value: "false"
              - resource_ids:
                  - dummy-3
              - resource_ids:
                  - dummy-4
                  - dummy-5
                options:
                  - name: sequential
                    value: "false"
        # ticket constraints
        ha_cluster_constraints_ticket:
          # simple constraint
          - resource:
              id: dummy-1
            ticket: ticket1
            options:
              - name: loss-policy
                value: stop
          # set constraint
          - resource_sets:
              - resource_ids:
                  - dummy-3
                  - dummy-4
                  - dummy-5
            ticket: ticket2
            options:
              - name: loss-policy
                value: fence

    This example playbook file configures a cluster running the firewalld and selinux services. The cluster includes resource location constraints, resource colocation constraints, resource order constraints, and resource ticket constraints.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.8. Configuring Corosync values in a high availability cluster

(RHEL 9.1 and later) The following procedure uses the ha_cluster system role to create a high availability cluster that configures Corosync values.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster that configures Corosync values
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_transport:
          type: knet
          options:
            - name: ip_version
              value: ipv4-6
            - name: link_mode
              value: active
          links:
            -
              - name: linknumber
                value: 1
              - name: link_priority
                value: 5
            -
              - name: linknumber
                value: 0
              - name: link_priority
                value: 10
          compression:
            - name: level
              value: 5
            - name: model
              value: zlib
          crypto:
            - name: cipher
              value: none
            - name: hash
              value: none
        ha_cluster_totem:
          options:
            - name: block_unlisted_ips
              value: 'yes'
            - name: send_join
              value: 0
        ha_cluster_quorum:
          options:
            - name: auto_tie_breaker
              value: 1
            - name: wait_for_all
              value: 1

    This example playbook file configures a cluster running the firewalld and selinux services that configures Corosync properties.

    When creating your playbook file for production, Vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.9. Configuring a high availability cluster with SBD node fencing

(RHEL 9.1 and later) The following procedure uses the ha_cluster system role to create a high availability cluster that uses SBD node fencing.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

This playbook uses an inventory file that loads a watchdog module (supported in RHEL 9.3 and later) as described in Configuring watchdog and SBD devices in an inventory.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Create a high availability cluster that uses SBD node fencing
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_sbd_enabled: yes
        ha_cluster_sbd_options:
          - name: delay-start
            value: 'no'
          - name: startmode
            value: always
          - name: timeout-action
            value: 'flush,reboot'
          - name: watchdog-timeout
            value: 30
        # Suggested optimal values for SBD timeouts:
        # watchdog-timeout * 2 = msgwait-timeout (set automatically)
        # msgwait-timeout * 1.2 = stonith-timeout
        ha_cluster_cluster_properties:
          - attrs:
              - name: stonith-timeout
                value: 72
        ha_cluster_resource_primitives:
          - id: fence_sbd
            agent: 'stonith:fence_sbd'
            instance_attrs:
              - attrs:
                  # taken from host_vars
                  - name: devices
                    value: "{{ ha_cluster.sbd_devices | join(',') }}"
                  - name: pcmk_delay_base
                    value: 30

    This example playbook file configures a cluster running the firewalld and selinux services that uses SBD fencing and creates the SBD Stonith resource.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.10. Configuring a high availability cluster that uses a quorum device (RHEL 9.2 and later)

To configure a high availability cluster with a separate quorum device by using the ha_cluster system role, first set up the quorum device. After setting up the quorum device, you can use the device in any number of clusters.

10.10.1. Configuring a quorum device

To configure a quorum device using the ha_cluster system role, follow these steps. Note that you cannot run a quorum device on a cluster node.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Configure a quorum device
      hosts: nodeQ
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_present: false
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_qnetd:
          present: true

    This example playbook file configures a quorum device on a system running the firewalld and selinux services.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.10.2. Configuring a cluster to use a quorum device

To configure a cluster to use a quorum device, follow these steps.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Configure a cluster to use a quorum device
      hosts: node1 node2
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_cluster_name: my-new-cluster
        ha_cluster_hacluster_password: <password>
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_quorum:
          device:
            model: net
            model_options:
              - name: host
                value: nodeQ
              - name: algorithm
                value: lms

    This example playbook file configures a cluster running the firewalld and selinux services that uses a quorum device.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory

10.11. Configuring an Apache HTTP server in a high availability cluster with the ha_cluster system role

This procedure configures an active/passive Apache HTTP server in a two-node Red Hat Enterprise Linux High Availability Add-On cluster using the ha_cluster system role.

Warning

The ha_cluster system role replaces any existing cluster configuration on the specified nodes. Any settings not specified in the role will be lost.

Prerequisites

Procedure

  1. Create a playbook file, for example ~/playbook.yml, with the following content:

    ---
    - name: Configure active/passive Apache server in a high availability cluster
      hosts: z1.example.com z2.example.com
      roles:
        - rhel-system-roles.ha_cluster
      vars:
        ha_cluster_hacluster_password: <password>
        ha_cluster_cluster_name: my_cluster
        ha_cluster_manage_firewall: true
        ha_cluster_manage_selinux: true
        ha_cluster_fence_agent_packages:
          - fence-agents-apc-snmp
        ha_cluster_resource_primitives:
          - id: myapc
            agent: stonith:fence_apc_snmp
            instance_attrs:
              - attrs:
                  - name: ipaddr
                    value: zapc.example.com
                  - name: pcmk_host_map
                    value: z1.example.com:1;z2.example.com:2
                  - name: login
                    value: apc
                  - name: passwd
                    value: apc
          - id: my_lvm
            agent: ocf:heartbeat:LVM-activate
            instance_attrs:
              - attrs:
                  - name: vgname
                    value: my_vg
                  - name: vg_access_mode
                    value: system_id
          - id: my_fs
            agent: Filesystem
            instance_attrs:
              - attrs:
                  - name: device
                    value: /dev/my_vg/my_lv
                  - name: directory
                    value: /var/www
                  - name: fstype
                    value: xfs
          - id: VirtualIP
            agent: IPaddr2
            instance_attrs:
              - attrs:
                  - name: ip
                    value: 198.51.100.3
                  - name: cidr_netmask
                    value: 24
          - id: Website
            agent: apache
            instance_attrs:
              - attrs:
                  - name: configfile
                    value: /etc/httpd/conf/httpd.conf
                  - name: statusurl
                    value: http://127.0.0.1/server-status
        ha_cluster_resource_groups:
          - id: apachegroup
            resource_ids:
              - my_lvm
              - my_fs
              - VirtualIP
              - Website

    This example playbook file configures a previously-created Apache HTTP server in an active/passive two-node HA cluster running the firewalld and selinux services.

    This example uses an APC power switch with a host name of zapc.example.com. If the cluster does not use any other fence agents, you can optionally list only the fence agents your cluster requires when defining the ha_cluster_fence_agent_packages variable, as in this example.

    When creating your playbook file for production, vault encrypt the password, as described in Encrypting content with Ansible Vault.

  2. Validate the playbook syntax:

    $ ansible-playbook --syntax-check ~/playbook.yml

    Note that this command only validates the syntax and does not protect against a wrong but valid configuration.

  3. Run the playbook:

    $ ansible-playbook ~/playbook.yml
  4. When you use the apache resource agent to manage Apache, it does not use systemd. Because of this, you must edit the logrotate script supplied with Apache so that it does not use systemctl to reload Apache.

    Remove the following line in the /etc/logrotate.d/httpd file on each node in the cluster.

    # /bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true

    Replace the line you removed with the following three lines, specifying /var/run/httpd-website.pid as the PID file path where website is the name of the Apache resource. In this example, the Apache resource name is Website.

    /usr/bin/test -f /var/run/httpd-Website.pid >/dev/null 2>/dev/null &&
    /usr/bin/ps -q $(/usr/bin/cat /var/run/httpd-Website.pid) >/dev/null 2>/dev/null &&
    /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -c "PidFile /var/run/httpd-Website.pid" -k graceful > /dev/null 2>/dev/null || true

Verification

  1. From one of the nodes in the cluster, check the status of the cluster. Note that all four resources are running on the same node, z1.example.com.

    If you find that the resources you configured are not running, you can run the pcs resource debug-start resource command to test the resource configuration.

    [root@z1 ~]# pcs status
    Cluster name: my_cluster
    Last updated: Wed Jul 31 16:38:51 2013
    Last change: Wed Jul 31 16:42:14 2013 via crm_attribute on z1.example.com
    Stack: corosync
    Current DC: z2.example.com (2) - partition with quorum
    Version: 1.1.10-5.el7-9abe687
    2 Nodes configured
    6 Resources configured
    
    Online: [ z1.example.com z2.example.com ]
    
    Full list of resources:
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: apachegroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z1.example.com
         my_fs      (ocf::heartbeat:Filesystem):    Started z1.example.com
         VirtualIP  (ocf::heartbeat:IPaddr2):       Started z1.example.com
         Website    (ocf::heartbeat:apache):        Started z1.example.com
  2. Once the cluster is up and running, you can point a browser to the IP address you defined as the IPaddr2 resource to view the sample display, consisting of the simple word "Hello".

    Hello
  3. To test whether the resource group running on z1.example.com fails over to node z2.example.com, put node z1.example.com in standby mode, after which the node will no longer be able to host resources.

    [root@z1 ~]# pcs node standby z1.example.com
  4. After putting node z1 in standby mode, check the cluster status from one of the nodes in the cluster. Note that the resources should now all be running on z2.

    [root@z1 ~]# pcs status
    Cluster name: my_cluster
    Last updated: Wed Jul 31 17:16:17 2013
    Last change: Wed Jul 31 17:18:34 2013 via crm_attribute on z1.example.com
    Stack: corosync
    Current DC: z2.example.com (2) - partition with quorum
    Version: 1.1.10-5.el7-9abe687
    2 Nodes configured
    6 Resources configured
    
    Node z1.example.com (1): standby
    Online: [ z2.example.com ]
    
    Full list of resources:
    
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: apachegroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z2.example.com
         my_fs      (ocf::heartbeat:Filesystem):    Started z2.example.com
         VirtualIP  (ocf::heartbeat:IPaddr2):       Started z2.example.com
         Website    (ocf::heartbeat:apache):        Started z2.example.com

    The web site at the defined IP address should still display, without interruption.

  5. To remove z1 from standby mode, enter the following command.

    [root@z1 ~]# pcs node unstandby z1.example.com
    Note

    Removing a node from standby mode does not in itself cause the resources to fail back over to that node. This will depend on the resource-stickiness value for the resources. For information about the resource-stickiness meta attribute, see Configuring a resource to prefer its current node.

Additional resources

  • /usr/share/ansible/roles/rhel-system-roles.ha_cluster/README.md file
  • /usr/share/doc/rhel-system-roles/ha_cluster/ directory