The stop operation failed for a Filesystem resource managing a GFS2 filesystem in a Pacemaker cluster

Solution Unverified - Updated -

Issue

  • A cluster node was fenced after a GFS2 Filesystem resource failed to stop.
  • The stop operation for a GFS2 filesystem failed after the resource agent logged that no processes on the mount point "were signalled" and then that it was "giving up":
Jan 24 18:11:30 node1 logger[3605]: INFO: Running stop for /dev/mapper/clvg-lvcldhub on /apps/gfs/risk/datahub
Jan 24 18:11:30 node1 Filesystem(clusterfs)[3624]: INFO: Trying to unmount /apps/gfs/risk/datahub
Jan 24 18:11:30 node1 logger[3637]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM
Jan 24 18:11:31 node1 Filesystem(clusterfs)[3650]: INFO: sending signal TERM to: appuser   2328 51698 45 18:10 ?        Sl     0:22 pmdtm -PR ...
Jan 24 18:11:31 node1 Filesystem(clusterfs)[3660]: INFO: sending signal TERM to: root      2641     1  0 Jan05 ?        Ssl    9:44 /opt/syslog-ng/libexec/syslog-ng -F --enable-core
Jan 24 18:11:31 node1 logger[3670]: INFO: sending signal TERM to: appuser  51698 50153 ...
Jan 24 18:11:32 node1 Filesystem(clusterfs)[3685]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM
Jan 24 18:11:32 node1 logger[3697]: INFO: sending signal TERM to: appuser  51698 50153 ...
Jan 24 18:11:33 node1 Filesystem(clusterfs)[3734]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM
Jan 24 18:11:33 node1 Filesystem(clusterfs)[3745]: INFO: sending signal TERM to: appuser  51698 50153 ...
Jan 24 18:11:34 node1 Filesystem(clusterfs)[3765]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL
Jan 24 18:11:35 node1 Filesystem(clusterfs)[3776]: INFO: sending signal KILL to: appuser  51698 50153 ...
Jan 24 18:11:36 node1 Filesystem(clusterfs)[3883]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL
Jan 24 18:11:36 node1 Filesystem(clusterfs)[4192]: INFO: No processes on /apps/gfs/risk/datahub were signalled. force_unmount is set to 'yes'
Jan 24 18:11:37 node1 Filesystem(clusterfs)[4706]: ERROR: Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL
Jan 24 18:11:37 node1 Filesystem(clusterfs)[4715]: INFO: No processes on /apps/gfs/risk/datahub were signalled. force_unmount is set to 'yes'
Jan 24 18:11:37 node1 kernel: GFS2: fsid=PROD_DH_DQ_CLUSTER:dbhub.1: recover generation 15 done
Jan 24 18:11:38 node1 crmd[3935]:  warning: Action 92 (clusterfs_stop_0) on node1 failed (target: 0 vs. rc: 1): Error
...
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with TERM ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ umount: /apps/gfs/risk/datahub: target is busy. ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [         (In some cases useful info about processes that use ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [          the device is found by lsof(8) or fuser(1)) ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub; trying cleanup with KILL ]
Jan 24 18:11:38 node1 lrmd[3932]:   notice: clusterfs_stop_0:3560:stderr [ ocf-exit-reason:Couldn't unmount /apps/gfs/risk/datahub, giving up! ]
Jan 24 18:11:38 node1 crmd[3935]:   notice: Result of stop operation for clusterfs on node1: 1 (unknown error)
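
When checking whether a node's logs match this issue, it can help to condense the stop sequence into a few counts. The following is a minimal sketch, not part of any Red Hat or resource-agents tooling: the function name summarize_stop and the summary keys are hypothetical, the patterns are copied from the messages in the excerpt above, and it assumes the second field after "to:" is the PID, as the ps-style output in those lines suggests.

```python
import re
from collections import Counter

# Message patterns copied from the log excerpt above.
UNMOUNT_FAILED = re.compile(r"ERROR: Couldn't unmount (\S+); trying cleanup with (TERM|KILL)")
SIGNAL_SENT = re.compile(r"INFO: sending signal (TERM|KILL) to: (\S+)\s+(\d+)")
NONE_SIGNALLED = re.compile(r"INFO: No processes on (\S+) were signalled")
GAVE_UP = re.compile(r"Couldn't unmount (\S+), giving up!")


def summarize_stop(lines):
    """Summarize one failed Filesystem stop sequence from syslog lines."""
    summary = {
        "failed_unmounts": Counter(),   # failed umount attempts per cleanup signal
        "signalled_pids": set(),        # PIDs the agent tried to signal (assumed field)
        "rounds_with_no_processes": 0,  # "No processes ... were signalled" messages
        "gave_up": False,               # agent reported "giving up!"
    }
    for line in lines:
        if GAVE_UP.search(line):
            summary["gave_up"] = True
            continue
        # lrmd replays the agent's stderr after the operation ends;
        # skip those lines so nothing is counted twice.
        if ":stderr [" in line:
            continue
        match = UNMOUNT_FAILED.search(line)
        if match:
            summary["failed_unmounts"][match.group(2)] += 1
            continue
        match = SIGNAL_SENT.search(line)
        if match:
            summary["signalled_pids"].add(int(match.group(3)))
            continue
        if NONE_SIGNALLED.search(line):
            summary["rounds_with_no_processes"] += 1
    return summary
```

Passing the lines of the excerpt above to summarize_stop should report three failed unmounts each for TERM and KILL, the signalled PIDs 2328, 2641, and 51698, two rounds in which no processes were signalled, and gave_up set to True. That combination (unmount still failing with "target is busy" after the agent finds nothing left to signal) is the pattern this issue describes.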

Environment

  • Red Hat Enterprise Linux 6, 7, or 8 (with the Resilient Storage Add-on)
  • Pacemaker
  • A GFS2 filesystem managed by an ocf:heartbeat:Filesystem resource
