fence_scsi_check.pl watchdog script does a soft reboot instead of hard and hangs during shutdown in a RHEL 6 or 7 Resilient Storage cluster with device-mapper-multipath
Issue
- After manually fencing a node that is actively running a resource group, the SCSI watchdog begins to initiate a reboot but fails to completely reboot the machine.
- When the watchdog reboots a node, it gets stuck shutting down. I see backtraces showing it waiting on device-mapper or the file system.
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add On
- Using SCSI Persistent Reservation Fencing (fence_scsi)
- Using the fence_scsi_check.pl watchdog script for fence_scsi to reboot a node when fenced
  - RHEL 7:
    - Using a fence-agents-scsi release prior to 4.0.11-27.el7_2.5, OR
    - Using fence-agents-scsi-4.0.11-27.el7_2.5 or later AND /etc/watchdog.d/fence_scsi_check is in place (as opposed to /etc/watchdog.d/fence_scsi_check_hardreboot)
  - RHEL 6:
    - Using a fence-agents release prior to 3.1.5-48.el6, OR
    - Using fence-agents-3.1.5-48.el6 or later AND /usr/share/cluster/fence_scsi_check.pl is linked or copied to /etc/watchdog.d (as opposed to /usr/share/cluster/fence_scsi_check_hardreboot.pl being linked or copied)
- device-mapper-multipath
  - The settings for the device in question enable queueing (even if only temporary) when all paths have failed
    - Can be enabled via no_path_retry set to "queue" or a value greater than 0 in /etc/multipath.conf, or in the built-in device settings in multipathd (see /usr/share/doc/device-mapper-multipath-$vers/multipath.conf.defaults)
    - Can be enabled via features "1 queue_if_no_path" in /etc/multipath.conf or built-in device settings in multipathd if no_path_retry is not set.
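
The queueing conditions in the Environment list can be expressed as a small check. The following is a minimal sketch in Python, not a Red Hat tool: the function names parse_settings and queueing_enabled are hypothetical, and the parser only handles simple "option value" lines from a single stanza. It does not consult the built-in device settings in multipathd, so a device may still queue even when this check returns False for the contents of /etc/multipath.conf.

```python
def parse_settings(stanza_text):
    """Collect 'option value' pairs from one multipath.conf stanza (simplified)."""
    settings = {}
    for line in stanza_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line == "}" or line.endswith("{"):
            continue  # skip blank lines and stanza delimiters
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def queueing_enabled(settings):
    """Return True if these settings queue I/O when all paths have failed."""
    no_path_retry = settings.get("no_path_retry")
    if no_path_retry is not None:
        value = no_path_retry.strip().strip('"').lower()
        if value == "queue":
            return True
        if value.isdigit():
            return int(value) > 0  # a retry count above 0 queues temporarily
        return False  # e.g. "fail": no queueing
    # The features line is only considered when no_path_retry is not set.
    features = settings.get("features", "").strip().strip('"')
    return "queue_if_no_path" in features.split()


if __name__ == "__main__":
    example = 'device {\n    features "1 queue_if_no_path"\n}'
    print(queueing_enabled(parse_settings(example)))
```

A result of True for a device used by the cluster, combined with the fence_scsi_check watchdog script conditions above, indicates the environment described in this article.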
