5.3. GFS2 File System Hangs and Requires Reboot of All Nodes

If your GFS2 file system hangs and does not return commands run against it, requiring that you reboot all nodes in the cluster before using it, check for the following issues.
  • You may have had a failed fence. GFS2 file systems will freeze to ensure data integrity in the event of a failed fence. Check the messages logs to see if there are any failed fences at the time of the hang. Ensure that fencing is configured correctly.
  • The GFS2 file system may have withdrawn. Check through the messages logs for the word withdraw and check for any messages and calltraces from GFS2 indicating that the file system has been withdrawn. A withdraw is indicative of file system corruption, a storage failure, or a bug. Unmount the file system, update the gfs2-utils package, and execute the fsck command on the file system to return it to service. Open a support ticket with Red Hat Support. Inform them you experienced a GFS2 withdraw and provide sosreports with logs.
    For information on the GFS2 withdraw function, see Section 4.14, “The GFS2 Withdraw Function”.
  • This error may be indicative of a locking problem or bug. Gather data during one of these occurrences and open a support ticket with Red Hat Support, as described in Section 5.2, “GFS2 File System Hangs and Requires Reboot of One Node”.