fence_scsi_check.pl watchdog script does a soft reboot instead of hard and hangs during shutdown in a RHEL 6 or 7 Resilient Storage cluster with GFS2
Issue
- scsi fencing watchdog does not hard reset a node
- The reboot that watchdog does is "soft" so things can hang during the shutdown
- When a node is fenced by
fence_scsi, it seems to get stuck on the way down and never reboots. There are hung task warnings on the console showing processes blocked waiting on GFS2.
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the Resilient Storage Add On
- GFS2
- Using SCSI Persistent Reservation Fencing (
fence_scsi) - Using the
fence_scsi_check.plwatchdog script forfence_scsito reboot a node when fenced- RHEL 7:
- Using a
fence-agents-scsirelease prior to4.0.11-27.el7_2.5, OR - Using
fence-agents-scsi-4.0.11-27.el7_2.5or later AND/etc/watchdog.d/fence_scsi_checkis in place (as opposed to/etc/watchdog.d/fence_scsi_check_hardreboot)
- Using a
- RHEL 6:
- Using a
fence-agentsrelease prior to3.1.5-48.el6, OR - Using
fence-agents-3.1.5-48.el6or later AND/usr/share/cluster/fence_scsi_check.plis linked or copied to/etc/watchdog.d(as opposed to/usr/share/cluster/fence_scsi_check_hardreboot.plbeing linked or copied)
- Using a
- RHEL 7:
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase of over 48,000 articles and solutions.
Welcome! Check out the Getting Started with Red Hat page for quick tours and guides for common tasks.
