Red Hat Training

A Red Hat training course is available for Red Hat Enterprise Linux

8.4. Configuring Failover Domains

A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
  • Unrestricted — Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
  • Restricted — Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
  • Unordered — When a cluster service is assigned to an unordered failover domain, the member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
  • Ordered — Allows you to specify a preference order among the members of a failover domain. Ordered failover domains select the node with the lowest priority number first. That is, the node in a failover domain with a priority number of "1" specifies the highest priority, and therefore is the most preferred node in a failover domain. After that node, the next preferred node would be the node with the next highest priority number, and so on.
  • Failback — Allows you to specify whether a service in the failover domain should fail back to the node that it was originally running on before that node failed. Configuring this characteristic is useful in circumstances where a node repeatedly fails and is part of an ordered failover domain. In that circumstance, if a node is the preferred node in a failover domain, it is possible for a service to fail over and fail back repeatedly between the preferred node and another node, causing severe impact on performance.

    Note

    The failback characteristic is applicable only if ordered failover is configured.

Note

Changing a failover domain configuration has no effect on currently running services.

Note

Failover domains are not required for operation.
By default, failover domains are unrestricted and unordered.
In a cluster with several members, using a restricted failover domain can minimize the work to set up the cluster to run a cluster service (such as httpd), which requires you to set up the configuration identically on all members that run the cluster service. Instead of setting up the entire cluster to run the cluster service, you can set up only the members in the restricted failover domain that you associate with the cluster service.

Note

To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.
To configure a failover domain, use the following procedures:
  1. Open /etc/cluster/cluster.conf at any node in the cluster.
  2. Add the following skeleton section within the rm element for each failover domain to be used:
    
            <failoverdomains>
                <failoverdomain name="" nofailback="" ordered="" restricted="">
                    <failoverdomainnode name="" priority=""/>
                    <failoverdomainnode name="" priority=""/>
                    <failoverdomainnode name="" priority=""/>
                </failoverdomain>
            </failoverdomains>
    

    Note

    The number of failoverdomainnode attributes depends on the number of nodes in the failover domain. The skeleton failoverdomain section in preceding text shows three failoverdomainnode elements (with no node names specified), signifying that there are three nodes in the failover domain.
  3. In the failoverdomain section, provide the values for the elements and attributes. For descriptions of the elements and attributes, see the failoverdomain section of the annotated cluster schema. The annotated cluster schema is available at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example /usr/share/doc/cman-3.0.12/cluster_conf.html) in any of the cluster nodes. For an example of a failoverdomains section, see Example 8.8, “A Failover Domain Added to cluster.conf.
  4. Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3">).
  5. Save /etc/cluster/cluster.conf.
  6. (Optional) Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
    [root@example-01 ~]# ccs_config_validate 
    Configuration validates
    
  7. Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
Example 8.8, “A Failover Domain Added to cluster.conf shows an example of a configuration with an ordered, unrestricted failover domain.

Example 8.8. A Failover Domain Added to cluster.conf


<cluster name="mycluster" config_version="3">
   <clusternodes>
     <clusternode name="node-01.example.com" nodeid="1">
         <fence>
            <method name="APC">
              <device name="apc" port="1"/>
             </method>
         </fence>
     </clusternode>
     <clusternode name="node-02.example.com" nodeid="2">
         <fence>
            <method name="APC">
              <device name="apc" port="2"/>
            </method>
         </fence>
     </clusternode>
     <clusternode name="node-03.example.com" nodeid="3">
         <fence>
            <method name="APC">
              <device name="apc" port="3"/>
            </method>
         </fence>
     </clusternode>
   </clusternodes>
   <fencedevices>
         <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>
   </fencedevices>
   <rm>
       <failoverdomains>
           <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">
               <failoverdomainnode name="node-01.example.com" priority="1"/>
               <failoverdomainnode name="node-02.example.com" priority="2"/>
               <failoverdomainnode name="node-03.example.com" priority="3"/>
           </failoverdomain>
       </failoverdomains>
   </rm>
</cluster>
The failoverdomains section contains a failoverdomain section for each failover domain in the cluster. This example has one failover domain. In the failoverdomain line, the name (name) is specified as example_pri. In addition, it specifies that resources using this domain should fail-back to lower-priority-score nodes when possible (nofailback="0"), that failover is ordered (ordered="1"), and that the failover domain is unrestricted (restricted="0").
Note:The priority value is applicable only if ordered failover is configured.