How can I recover from a gfs2 withdrawal and fix any filesystem corruption that might exist on a RHEL High Availability cluster?


Environment

  • Red Hat Enterprise Linux (RHEL) 5, 6, 7, 8, 9 with the Resilient Storage Add On
  • A gfs2 filesystem has withdrawn

Issue

  • We can see a withdrawal message in /var/log/messages. How can we recover the filesystem?
  • My gfs2 file system is reporting invalid metadata blocks. What do I do to fix this and prevent it from occurring again?
  • A node keeps panicking with gfs2_meta_indirect_buffer in the backtrace when I have errors=panic in the mount options for a gfs2 file system.
  • A withdrawal happened on a gfs2 file system. Could this have left corruption in the metadata that needs to be fixed?
  • A withdrawal happened on a gfs2 file system. Could this have been caused by corruption?
kernel: GFS2: fsid=myCluster:myFS.1: fatal: invalid metadata block
kernel: GFS2: fsid=myCluster:myFS.1:   bh = 2119565 (magic number)
kernel: GFS2: fsid=myCluster:myFS.1:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 401
kernel: GFS2: fsid=myCluster:myFS.1: about to withdraw this file system
kernel: GFS2: fsid=myCluster:myFS.1: telling LM to unmount
kernel: GFS2: fsid=myCluster:myFS.1: withdrawn
kernel: Pid: 3218, comm: glock_workqueue Tainted: P           ---------------    2.6.32-279.5.2.el6.x86_64 #1
kernel: Call Trace:
kernel: [<ffffffffa0761062>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]

Resolution

A gfs2 withdraw prevents all access to that filesystem, and the affected node must be rebooted before the LVM device can be accessed again. Our recommendation is to do the following after rebooting the cluster node that had the withdrawal:

  • Capture the metadata of the gfs2 filesystem (Optional).
  • Then fsck.gfs2 the filesystem (Recommended).

    NOTE: While running fsck.gfs2, verify that the correct device is specified with the fsck command. A manual error in specifying the device name, e.g. specifying the underlying PV name instead of the actual LVM volume that holds the GFS2 filesystem, could corrupt the on-disk LVM2 metadata. Please see article 3147241 for more information.

If the cluster node needs to be put back into production quickly, you may want to skip those two steps, as they can require a prolonged outage to complete. In some cases, however, an fsck.gfs2 is required because filesystem errors prevent the filesystem from being mounted. Running fsck.gfs2 at the earliest maintenance window is recommended to fix any filesystem errors.

Check for Withdrawal and Reboot Affected Nodes

All cluster nodes that have any gfs2 filesystems withdrawn must be rebooted before the filesystem can be remounted. Determine if a filesystem has withdrawn on your node by either:

Checking the value of /sys/fs/gfs2/<locktable>/withdraw on each node individually. If the value is 1, the filesystem has withdrawn:

# cat /sys/fs/gfs2/myCluster:myFS/withdraw
1

Checking for a withdraw message in /var/log/messages around the time the problem is/was suspected, looking for messages like the following:

$ grep -ie fatal -ie withdraw /var/log/messages
Jan 23 18:21:16 node42 kernel: GFS: fsid=mycluster:improbable_drive.0: fatal: invalid metadata block
Jan 23 18:21:16 node42 kernel: GFS: fsid=mycluster:improbable_drive.0: about to withdraw from the cluster
Jan 23 18:21:16 node42 kernel: GFS: fsid=mycluster:improbable_drive.0: telling LM to withdraw
Jan 23 18:21:16 node42 kernel: GFS: fsid=mycluster:improbable_drive.0: withdrawn
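The same check can be scripted; below is a minimal sketch that extracts the fsids of withdrawn filesystems from the log format shown above (the script and the default log path are illustrative, not part of any Red Hat tooling):

```shell
#!/bin/sh
# List the unique fsids of filesystems that have logged a withdraw,
# based on the message format shown above. The log path is an example.
LOG="${1:-/var/log/messages}"

withdrawn_fsids() {
    # Match both "GFS:" (gfs1) and "GFS2:" prefixes, then pull out
    # the cluster:filesystem portion of the fsid (dropping the .N journal index).
    grep -E 'GFS2?: fsid=.*withdrawn' "$1" |
        sed -n 's/.*fsid=\([^:]*:[^.]*\)\..*/\1/p' |
        sort -u
}

if [ -r "$LOG" ]; then
    withdrawn_fsids "$LOG"
fi
```

Run against the example messages above, this would print a single line, `mycluster:improbable_drive`.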

If a gfs2 filesystem has withdrawn on a node, that filesystem cannot be mounted (or accessed to capture metadata or fix the filesystem) on that node until after the node is rebooted. In order to capture the metadata or check the filesystem, the filesystem must be unmounted on all cluster nodes.

Unmount the gfs2 filesystem to collect metadata and fsck the filesystem

Unmount the file system from all nodes first. If the gfs2 filesystem is not unmounted on all nodes, corruption could occur when fsck.gfs2 is run on the filesystem, and any metadata captured would be invalid. Verify that the gfs2 filesystem is unmounted from all cluster nodes before proceeding.

# umount /export/gfs2
# mount | grep '/export/gfs2'

If you are using Pacemaker to manage your gfs2 filesystems, disable the cloned resource. Then verify that no cluster node still has the gfs2 filesystem mounted.

# pcs resource disable <name of cloned resource>
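Checking every node by hand is error-prone; the following is a minimal sketch of a remote check, assuming password-less `ssh` access between nodes (the function name, the `RSH` override, and the node names and mount point in the usage example are hypothetical):

```shell
#!/bin/sh
# Report any cluster node that still has the given gfs2 mount point mounted.
# RSH can be overridden (e.g. RSH="ssh -o BatchMode=yes") to suit your setup.
RSH="${RSH:-ssh}"

gfs2_mounted_anywhere() {
    # $1 = mount point, remaining arguments = node names.
    # Returns 0 (and prints the nodes) if any node still has it mounted,
    # 1 if no node does.
    mnt="$1"; shift
    found=1
    for node in "$@"; do
        if $RSH "$node" mount | grep -q " $mnt "; then
            echo "$node: $mnt is still mounted"
            found=0
        fi
    done
    return $found
}
```

For example, `gfs2_mounted_anywhere /export/gfs2 node1 node2 node3` prints any node that still has the mount; it prints nothing and returns non-zero once the filesystem is unmounted everywhere.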

Capture diagnostic data before checking filesystem (Optional)

Prior to checking the file system, it may be useful to capture information from the file system so that the cause of the withdrawal may be determined at a later time. Checking the file system may fix whatever issue led to this, or may change the state of the file system such that it would be impossible to determine what caused it. By capturing a dump of the metadata now, it preserves a copy of the information which may help diagnose the problem and prevent it from occurring again.

NOTE: This may take some time, depending on the size and state of the file system. If getting the file system up and running is the priority, this step can be skipped, but it should be noted that this may eliminate the necessary information required to make a proper diagnosis.
NOTE: The saved metadata file created by gfs2_edit may be large, so provide a path that has ample free space. Note that this command does not capture data stored on the file system, but rather just the metadata.

Save the gfs2 filesystem metadata for later analysis. From one node only, execute the following command to save the filesystem metadata:

# gfs2_edit savemeta <device path> <output file path>

If savemeta fails for any reason, savemetaslow can be used instead (though it will take longer). NOTE: gfs2_edit gzip-compresses the saved metadata file itself, so it does NOT need to be compressed again with another tool such as zip, gzip, or bzip2.

# gfs2_edit savemetaslow <device path> <output file path>.gz

The file generated by this command can be provided to Red Hat Global Support Services along with any other relevant data, to assist in diagnosing the issue.
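Because the saved file is already gzip-compressed, a quick check of the gzip magic bytes can confirm there is no need to compress it again before uploading it to support; a small sketch (the helper name is hypothetical):

```shell
#!/bin/sh
# Check whether a file is already gzip-compressed, so a saved metadata
# file is not needlessly re-compressed before being sent to support.
is_gzip_file() {
    # gzip files begin with the two magic bytes 0x1f 0x8b.
    [ -s "$1" ] && [ "$(od -An -tx1 -N2 "$1" | tr -d ' ')" = "1f8b" ]
}
```

Usage: `is_gzip_file /root/myFS.meta.gz && echo "already compressed"` (the path is a placeholder).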

Check the filesystem

After any nodes that had a filesystem withdraw have been rebooted, the filesystem should be checked with fsck.gfs2 (or gfs2_fsck) to determine whether there are any outstanding metadata issues that resulted from, or caused, the withdrawal.

NOTE: Red Hat recommends that the gfs2-utils package be updated prior to running fsck.gfs2, as this will provide the latest version of the fsck.gfs2 program with the latest optimizations and ability to fix the most known issues.
NOTE: Unmount the file system from all nodes first. If the gfs2 filesystem is not unmounted on all nodes, corruption could occur when fsck.gfs2 is run on the filesystem. Verify that the gfs2 filesystem is unmounted from all cluster nodes before proceeding.
NOTE: This may take some time, depending on the size and state of the file system. If getting the file system up and running is the priority, this step can be postponed, but any unrepaired metadata errors will remain and may trigger another withdrawal.

Run fsck.gfs2 from one node only. The following command includes -y to answer "yes" to all questions, meaning metadata may be changed during the run. Omit this option to be prompted before each repair, or use "-n" to report problems without changing anything. The command below saves the output to a file that can be reviewed later.

# fsck.gfs2 -v -y <device path> 2>&1 | tee <output file>

If the fsck.gfs2 output indicates that it has not yet modified data on disk, it can be quit with ctrl-c. If fsck.gfs2 has begun modifying data, it is recommended that it be allowed to finish, regardless of how long it may take.

  • Pressing ctrl-c once interrupts fsck.gfs2 and prints the status of the check so far.
  • Pressing ctrl-c twice cancels fsck.gfs2. It is not recommended to quit fsck.gfs2 if it has already begun modifying the filesystem.

Remount the filesystem

If time permits, rerun fsck.gfs2 on the gfs2 filesystem and verify that no errors are detected. If 0 is returned, no errors were found; a non-zero return (such as 1) indicates that errors were detected.

# fsck.gfs2 -v -y <device path>
# echo $?
0
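The return-code check above can be folded into a small helper for maintenance scripts; this is a sketch under the simplifying assumption that any non-zero status means errors were detected (see fsck.gfs2(8) for the full set of return codes, and note that the helper name is hypothetical):

```shell
#!/bin/sh
# Decide whether a filesystem is safe to remount based on an fsck-style
# exit status: 0 means clean; anything else means errors were detected.
fsck_status_ok() {
    if [ "$1" -eq 0 ]; then
        echo "clean - safe to remount"
        return 0
    else
        echo "errors detected (fsck.gfs2 exit status $1) - investigate before remounting"
        return 1
    fi
}
```

Usage would look like `fsck.gfs2 -v -y <device path>; fsck_status_ok $?`.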

After fsck.gfs2 has completed, the file systems can be remounted and used again. If issues persist, contact Red Hat Global Support Services, ideally with the diagnostic data captured above, for assistance.

Root Cause

When a filesystem error occurs, gfs2 filesystems will withdraw and become unavailable.

  • Filesystem errors can lie dormant in the filesystem for some time before that section of the filesystem is accessed and the withdraw is triggered.
  • To restore the filesystem and allow it to be remounted, all nodes that have had the filesystem withdraw must be rebooted.
  • Failure to reboot the node before the filesystem is remounted could result in further filesystem corruption as stale data may be stored in memory.
  • The filesystem withdraw was implemented as a nicer option than causing a kernel panic, and it allows the cluster administrator to conduct a post-mortem of the issue and reboot the node at a convenient time. Any services on the node that do not use the gfs2 filesystem can continue to run until the reboot.
  • Please see the Global File System 2 manual for more information about the gfs2 withdraw function.

Diagnostic Steps

If the following symptoms are present on your system, this solution may apply to you:

  • Your gfs2 filesystem becomes unavailable and cannot be accessed.
  • There are GFS2 filesystem withdraw messages in /var/log/messages on one of your cluster nodes.
  • The virtual file /sys/fs/gfs2/<locktable>/withdraw has a value of 1 if the gfs2 filesystem has withdrawn:
# cat /sys/fs/gfs2/rhel5clu\:disk1/withdraw 
1
# for locktable in $(ls /sys/fs/gfs2/); do echo -n "Checking $locktable: "; if [ $(cat /sys/fs/gfs2/$locktable/withdraw) -eq 1 ]; then echo "Withdrawn"; else echo "OK"; fi; done
Checking rhel5clu:disk1: Withdrawn

