Why do Oracle RAC nodes reboot when using dm-multipath and the fabric ports are flapping?
Environment
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
- Oracle RAC 2 node cluster
- device-mapper-multipath
Issue
- Two Oracle RAC cluster nodes crashed at virtually the same time.
- SAN path failures occurred around the time of the crash.
Resolution
- Create an /etc/multipath.conf file using the defaults from "/usr/share/doc/device-mapper-multipath-/multipath.conf.defaults". Be sure to set no_path_retry to fail and turn off queue_if_no_path for Oracle voting disks and Red Hat Cluster Suite quorum disks. See Multipath Configuration for more details and examples.
- Tune Oracle so that it does not fence before device-mapper-multipath has had enough time to fail a path and send I/O down one of the remaining active paths.
- If changes are made to /etc/multipath.conf, reload multipathd to load the changes, then confirm that they have taken effect:
  $ service multipathd reload
  $ multipath -ll
- The formula to determine the time before Oracle fences is (threshold - 1) * 2 seconds. The threshold is set via the O2CB_HEARTBEAT_THRESHOLD variable, configurable in /etc/sysconfig/o2cb. It is recommended to consult Oracle for their input here.
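As a worked example of the formula above, the following shell sketch computes the fence time for a given threshold. The function name o2cb_fence_seconds and the value 31 are illustrative assumptions, not recommended settings.

```shell
#!/bin/sh
# Sketch: seconds before fencing, using the formula (threshold - 1) * 2.
# o2cb_fence_seconds is a hypothetical helper name, not part of any Oracle tool.
o2cb_fence_seconds() {
    # $1: the value of O2CB_HEARTBEAT_THRESHOLD
    echo $(( ($1 - 1) * 2 ))
}

# Illustrative value only: a threshold of 31 gives (31 - 1) * 2 = 60 seconds.
o2cb_fence_seconds 31
```

On a live node the threshold could be read by sourcing /etc/sysconfig/o2cb and passing "$O2CB_HEARTBEAT_THRESHOLD" to the function. The result should be longer than the measured multipath failover time.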
Root Cause
- Fabric switch ports were flapping, which caused SAN path failures.
- device-mapper-multipath was not configured correctly (/etc/multipath.conf was missing) and was unable to fail over to the other paths.
- device-mapper-multipath was configured to queue all I/O when no paths were available, rather than fail it immediately. This caused Oracle RAC to time out and conclude that it could no longer write to the disk.
- Oracle RAC rebooted the system because its integrity could not be assured once it could no longer write to the shared storage.
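To avoid the queueing behavior described above, a per-device override in /etc/multipath.conf might look like the following fragment. This is a minimal sketch: the wwid value and the alias are placeholders, and the correct settings for a given array should be confirmed with the storage vendor.

```
# /etc/multipath.conf fragment (illustrative; wwid and alias are placeholders)
multipaths {
    multipath {
        wwid            <WWID of the voting or quorum disk>
        alias           votingdisk1
        no_path_retry   fail    # fail I/O immediately when all paths are down
    }
}
```

Any features "1 queue_if_no_path" line that applies to the same device should be removed as well, so that the two settings do not conflict.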
Diagnostic Steps
1) Open a case with Oracle support to find out if Oracle RAC fenced the node and for precisely which reason.
2) Check /var/log/messages for any messages that indicate why a server might reboot.
3) Check for messages on the other Oracle RAC nodes at the same time that the first server rebooted to see if it was killed by another node.
4) Check cssd.log and crsd.log for any indications that the node was rebooted by Oracle.
5) Look for log messages from the o2net daemon indicating that it could not contact another node:
kernel: o2net: accepted connection from node hostname (num 1) at 192.168.0.2:7777
kernel: o2net: connection to node hostname (num 1) at 192.168.0.2:7777 has been idle for 30.0 seconds, shutting it down.
kernel: (0,0):o2net_idle_timer:1503 here are some times that might help debug the situation: <data omitted>
kernel: o2net: no longer connected to node hostname (num 1) at 192.168.0.2:7777
6) Check whether Oracle RAC's cssd daemon is configured to reboot the node with SysRq+B:
$ grep FAST_REBOOT /etc/rc.d/init.d/init.cssd | grep sysrq
FAST_REBOOT="/sbin/reboot -n -f & $SLEEP 1 ; $ECHO b > /proc/sysrq-trigger"
FAST_REBOOT="$ECHO b > /proc/sysrq-trigger"
FAST_REBOOT="$ECHO b > /proc/sysrq-trigger"
7) Check that the configuration multipathd is using is sane and appropriate for the current array.
8) Conduct a test in which a fibre cable is removed from the server while it is running, to see whether that path is failed and I/O continues down another path.
9) Once multipath is configured properly, test/determine multipath failover times.
10) Temporarily disable the Oracle software so that the node will not be fenced prematurely during the test.
11) Use a statement like: logger '!!!!! TESTING NOW !!!!!' to mark where in the logs the test started.
12) Immediately after (11), fail one of the paths to the mpath device.
13) Watch /var/log/messages to determine how long it takes for the path to be failed properly.
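Steps 11 to 13 can be sketched as a small shell helper that prints only the multipath-related log lines recorded after the most recent test marker. The function name lines_since_marker is a hypothetical choice, and matching on the string "multipath" is an assumption about which messages are relevant. Adjust the pattern for the messages your systems actually log.

```shell
#!/bin/sh
# Marker text matching the logger statement in step 11.
MARKER='!!!!! TESTING NOW !!!!!'

# lines_since_marker is a hypothetical helper name.
# $1: log file to scan, for example /var/log/messages
lines_since_marker() {
    awk -v marker="$MARKER" '
        # Each time the marker is seen, discard earlier matches and start over.
        index($0, marker) { n = 0; seen = 1; next }
        # After the marker, keep any line that mentions multipath.
        seen && tolower($0) ~ /multipath/ { buf[++n] = $0 }
        END { for (i = 1; i <= n; i++) print buf[i] }
    ' "$1"
}
```

A test run would write the marker with logger "$MARKER", fail one path, and then call lines_since_marker /var/log/messages. Comparing the timestamp of the marker with the timestamps of the printed lines gives an estimate of the failover time.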
