4.3. Network Topology

A cluster’s network is either complicated or really complicated. A basic cluster involves several sets of logical network pathways. Some of these share physical interfaces, and some require dedicated physical interfaces and VLANs, depending on the degree of robustness required. This example is based on a topology that Red Hat uses to certify Oracle RAC/GFS, but it is also suitable for the HA configuration.

Note

Cluster networks require several VLANs and multiple address assignments across those VLANs. If bonds span VLANs or switches, ARP monitoring may be required to ensure correct failover behavior in the event of a link failure.
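As a minimal sketch, bonding can be switched from MII link monitoring to ARP monitoring in /etc/modprobe.conf; the 1000 ms interval and the target addresses below are illustrative assumptions, not recommended values:

# Poll ARP targets every second instead of relying on MII link state.
# The interval and target addresses are examples only.
options bond0 mode=1 arp_interval=1000 arp_ip_target=192.168.2.1,192.168.2.2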

4.3.1. Public Network

The public network is the pathway used by the application tier to access the database. The failure scenario of interest is the loss of an entire node; bonding does protect against the failure of a public interface, but that is a less likely event. Bonded public interfaces also complicate application-tier network configuration and failover sequencing, so this network is not bonded in our example.
The hostnames of the server nodes are identified by the public address. All other network interfaces are private, but they may still need addresses assigned by network operations.

Note

Oracle Clusterware (CRS) creates its own set of Virtual IPs (VIPs) on the public interface. This mechanism makes it possible for CRS on another node to provide continued access to a failed node’s public address. Bonded public interfaces are not recommended in the presence of CRS VIPs. See Oracle SQL*Net Configuration in both the HA and RAC/GFS chapters.
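As an illustration (the interface name eth0 is an assumption), a CRS VIP appears as an additional address on the public interface and can be inspected with standard tools:

# The node's own public address and any CRS-managed VIPs appear on the
# public interface; after a node failure, CRS brings the failed node's
# VIP up on a surviving node.
$ /sbin/ip addr show eth0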

4.3.2. Red Hat Cluster Suite Network

The Red Hat Cluster Suite network is used by CMAN to monitor and manage the health of the cluster. This network is critical to the proper functioning of the cluster and is the pathway that is bonded most often.

Note

RAC requires GFS clustered file systems, which rely on the Distributed Lock Manager (DLM) to coordinate access to GFS. Oracle Global Cache Services (GCS) is often configured to use this pathway as well. There is a risk of overloading this network, but that is highly workload dependent. An advanced administrator may also choose to use InfiniBand and Reliable Datagram Sockets (RDS) to implement GCS.
The network is private and only ever used by cluster members. The dual-ported e1000 NIC is used for the Red Hat Cluster Suite heartbeat service or the Oracle RAC Clusterware services.
The file /etc/modprobe.conf contains all four interfaces, and the two ports of the e1000 are bonded together. The options for bond0 set the bond for failover (not load balancing), with a link-sampling interval of 100 ms. Once /etc/modprobe.conf has been modified, remove and reload the e1000 kernel module for the change to take effect immediately; otherwise it takes effect at the next reboot (see the commands after the example below).
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
alias eth3 e1000
alias bond0 bonding
options bond0 mode=1 miimon=100
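A minimal sketch of applying the change without a reboot, assuming the bonded interfaces are not yet carrying traffic and the interface names match the example above:

# Unload and reload the e1000 driver so the new configuration is read.
$ sudo modprobe -r e1000
$ sudo modprobe e1000
# Bring the bond and its slaves up with the new settings.
$ sudo service network restart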
Configuring the bond requires three network-scripts files: one for bond0 and one for each of the two slave interfaces, as shown in the following examples.
ifcfg-eth2

# Intel Corporation 82546GB Gigabit Ethernet Controller
DEVICE=eth2
HWADDR=00:04:23:D4:88:BE
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet
ONBOOT=no

ifcfg-eth3

# Intel Corporation 82546GB Gigabit Ethernet Controller
DEVICE=eth3
HWADDR=00:04:23:D4:88:BF
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
TYPE=Ethernet
ONBOOT=no

ifcfg-bond0

DEVICE=bond0
IPADDR=192.168.2.162
NETMASK=255.255.255.0
NETWORK=192.168.2.0
BROADCAST=192.168.2.255
BOOTPROTO=none
TYPE=Ethernet
ONBOOT=yes
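Once the interfaces are up, the state of the bond can be verified through the bonding driver’s status file, which reports the bonding mode, the currently active slave, and the MII status of each port:

$ cat /proc/net/bonding/bond0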

4.3.3. Fencing Network

When Red Hat Cluster Suite has determined that a cluster node must be removed from the active cluster, it must fence that node. Both methods used in this cluster are power-managed. The HP iLO BMC has one Ethernet port, which must be configured, and this information must exactly match the fencing clauses in the /etc/cluster/cluster.conf file. Most IPMI-based interfaces have only one network interface, which may prove to be a single point of failure for the fencing mechanism. A unique feature of Red Hat Cluster Suite is the ability to nest fence domains so that an alternative fence method is available if the BMC pathway fails; a switched Power Distribution Unit (PDU) can be configured as that backup (and it frequently has only one port as well). We do not recommend FCP port fencing or the T.10 SCSI reservations fence agent for mission-critical database applications. The address and user/password must also be correct in /etc/cluster/cluster.conf.

<fencedevices>
  <fencedevice agent="fence_ilo" hostname="192.168.1.7" login="rac" name="jLO7" passwd="jeff99"/>
  <fencedevice agent="fence_ilo" hostname="192.168.1.8" login="rac" name="jLO8" passwd="jeff99"/>
</fencedevices>
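As a sketch of the nested fencing described above (the PDU device name "pdu1" and outlet port number are hypothetical, introduced only for illustration), a node entry can list the iLO method first and a switched PDU method as the fallback:

<clusternode name="rac7-priv" nodeid="1" votes="1">
  <fence>
    <method name="1">
      <!-- Primary: fence through the iLO BMC -->
      <device name="jLO7"/>
    </method>
    <method name="2">
      <!-- Fallback: power-cycle the node's outlet on a hypothetical PDU "pdu1" -->
      <device name="pdu1" port="7"/>
    </method>
  </fence>
</clusternode>

The corresponding <fencedevice> entry for the PDU (for example, one using the fence_apc agent) would be added alongside the iLO entries in <fencedevices>.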

Note

You can test the fencing configuration manually with the fence_node command. Test early and often.
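For example, using the node names defined later in this chapter, fencing can be exercised from another cluster member:

$ sudo fence_node rac7-priv
# Watch /var/log/messages on the surviving node to confirm the fence succeeded.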

4.3.4. Red Hat Cluster Suite services

There are now enough hardware and software pieces in place that the cluster.conf file can be completed and parts of the cluster can be initialized. Red Hat Cluster Suite consists of a set of services (cman, qdiskd, fenced) that ensure cluster integrity. The values below are from the RAC example; the timeouts are good starting points for either configuration, and XML comments give the HA equivalents. More details on the RAC example are provided in Chapter 5, RAC/GFS Cluster Configuration, and on the HA example in Chapter 6, Cold Failover Cluster Configuration.


<cluster config_version="2" name="HA585">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <quorumd interval="7" device="/dev/mapper/qdisk" tko="9" votes="1" log_level="5"/>
  <cman deadnode_timeout="30" expected_nodes="7"/>
  <!-- HA equivalent: <cman deadnode_timeout="30" expected_votes="3"/> -->
  <!-- HA equivalent: <totem token="31000"/> -->
  <multicast addr="225.0.0.12"/>
  <clusternodes>
    <clusternode name="rac7-priv" nodeid="1" votes="1">
      <multicast addr="225.0.0.12" interface="bond0"/>
      <fence>
        <method name="1">
          <device name="jLO7"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rac8-priv" nodeid="2" votes="1">
      <multicast addr="225.0.0.12" interface="bond0"/>
      <fence>
        <method name="1">
          <device name="jLO8"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="192.168.1.7" login="rac" name="jLO7" passwd="jeff123456"/>
    <fencedevice agent="fence_ilo" hostname="192.168.1.8" login="rac" name="jLO8" passwd="jeff123456"/>
  </fencedevices>
</cluster>

The cluster node names rac7-priv and rac8-priv must be resolvable on every node and are therefore included in each node's /etc/hosts file:
192.168.1.7          rac7-priv.example.com           rac7-priv
192.168.1.8          rac8-priv.example.com           rac8-priv
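Resolution can be checked on each node before the cluster services are started; getent consults /etc/hosts as well as DNS:

$ getent hosts rac7-priv
$ getent hosts rac8-priv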

Note

When doing initial testing, set the default runlevel to 2 in the /etc/inittab file to aid node testing. If the configuration is broken and the node reboots back into runlevel 3, the startup will hang, which impedes debugging. Open a window and tail the /var/log/messages file to track your progress.
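For example, the default runlevel line in /etc/inittab would read as follows during testing (restore it to 3 once the cluster services are stable):

id:2:initdefault: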
The qdiskd service is the first service to start and is responsible for parsing the cluster.conf file. Any errors will appear in the /var/log/messages file and qdiskd will exit. If qdiskd starts up, then cman should be started next.
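On RHEL 5 the services can be started manually in that order while watching the log, for example:

$ sudo service qdiskd start
$ sudo service cman start
# Follow progress in another window:
$ tail -f /var/log/messages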
Assuming no glitches in configuration (consider yourself talented if the node enters the cluster on the first attempt), we can now ensure that the qdiskd and cman services start on boot:
$ sudo chkconfig --level 3 qdiskd on
$ sudo chkconfig --level 3 cman on
At this point, shut down all services on this node and repeat the steps in this chapter for the second node. You can copy the multipath.conf and cluster.conf configuration files to the second node to make things easier. From here the configuration process diverges and becomes RAC/GFS- or HA-specific. For information on configuring a RAC/GFS cluster, continue with Chapter 5, RAC/GFS Cluster Configuration. For information on configuring a cold failover HA cluster, continue with Chapter 6, Cold Failover Cluster Configuration.
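A minimal sketch of copying the configuration to the second node (the node name rac8-priv and the destination paths assume the example above):

$ scp /etc/multipath.conf root@rac8-priv:/etc/
$ scp /etc/cluster/cluster.conf root@rac8-priv:/etc/cluster/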