- Red Hat Enterprise Linux Server 5.8 or 5.9 (with the High Availability Add on)
- Affected kernels are those later than 2.6.18-274.12.1.el5 and earlier than 2.6.18-348.1.1.el5 (neither of these two kernels is itself affected)
- GFS2 filesystems
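To check whether a node's running kernel falls inside the affected range, a comparison along the following lines may help. This is a minimal sketch, not a Red Hat tool: the function names `ver_cmp` and `kernel_is_affected` are hypothetical, and it assumes release strings of the usual `2.6.18-N[.N.N].el5` form, comparing only their numeric fields.

```shell
#!/bin/bash
# Hypothetical helper (not part of any package): compare two kernel release
# strings numerically, field by field. Prints -1, 0, or 1.
ver_cmp() {
    local a b i n x y
    # Drop the ".el5" tag and anything after it, then split on "." and "-".
    IFS='.-' read -r -a a <<< "${1%%.el5*}"
    IFS='.-' read -r -a b <<< "${2%%.el5*}"
    n=${#a[@]}
    (( ${#b[@]} > n )) && n=${#b[@]}
    for (( i = 0; i < n; i++ )); do
        # Keep digits only; a missing or non-numeric field counts as 0.
        x=${a[i]//[^0-9]/}; x=${x:-0}
        y=${b[i]//[^0-9]/}; y=${y:-0}
        if (( 10#$x < 10#$y )); then echo -1; return; fi
        if (( 10#$x > 10#$y )); then echo 1; return; fi
    done
    echo 0
}

# Succeeds only if the release is strictly between the two boundary kernels
# named in this article (both boundaries are unaffected).
kernel_is_affected() {
    [ "$(ver_cmp "$1" 2.6.18-274.12.1.el5)" -eq 1 ] &&
        [ "$(ver_cmp "$1" 2.6.18-348.1.1.el5)" -eq -1 ]
}

if kernel_is_affected "$(uname -r)"; then
    echo "Running kernel $(uname -r) is in the affected range"
else
    echo "Running kernel $(uname -r) is outside the affected range"
fi
```

The check only covers the running kernel on one node; it would need to be repeated on every cluster node.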
A GFS2 withdraw occurs with a message similar to the following:
GFS2: fsid=MyCluster:MyGFS.0: function = get_leaf, file = fs/gfs2/dir.c, line = 763
- The issue is fixed in the following kernels:
kernel-2.6.18-348.1.1.el5 or later in RHEL 5 Update 9
kernel-2.6.18-371.el5 or later in RHEL 5 Update 10 and above
- Install the newer kernel on all cluster nodes and perform a full cluster restart.
- Update to the latest gfs2-utils package.
- Run fsck on the GFS2 filesystem using the steps in "How can I recover from a GFS2 withdrawal and fix any filesystem corruption that might exist in a Red Hat Enterprise Linux 5, 6, or 7 Resilient Storage cluster?". You can skip gathering diagnostic data, as this issue is already fully explained.
- GFS2 filesystems will withdraw when they encounter corruption to prevent additional corruption from occurring.
- More information about this fix is available in the RHEL 5.9 release notes:
Previously, GFS2 did not properly free directory hash table memory from cache when the directory was removed from cache. If the same GFS2 inode was later reused as another directory, the stale directory hash table was reused instead of reading the correct information from the media. If the GFS2 hash table was not reused, a small amount of memory was lost until the next reboot. If the hash table was reused, the directory could become corrupt. Later, GFS2 could discover the file system inconsistency and withdraw from the file system, making it unavailable until the system was rebooted. This update applies a patch to the kernel that frees the directory hash table correctly from cache and prevents this file system corruption.
If the following symptoms exist, this solution may apply:
- Filesystem withdraws do not occur when running kernel 2.6.18-274.12.1.el5 or earlier, or kernel 2.6.18-348.1.1.el5 or later.
- Review the /var/log/messages file(s) and look for a GFS2 withdraw message similar to:
kernel: GFS2: fsid=rhcluster:app02.2: fatal: invalid metadata block
kernel: GFS2: fsid=rhcluster:app02.2: bh = 9120530 (magic number)
kernel: GFS2: fsid=rhcluster:app02.2: function = get_leaf, file = fs/gfs2/dir.c, line = 763
kernel: GFS2: fsid=rhcluster:app02.2: about to withdraw this file system
kernel: GFS2: fsid=rhcluster:app02.2: telling LM to withdraw
kernel: GFS2: fsid=rhcluster:app02.2: withdrawn
kernel:
kernel: Call Trace:
kernel: [<ffffffff8890c764>] :gfs2:gfs2_lm_withdraw+0xd3/0x100
kernel: [<ffffffff80063a2a>] __wait_on_bit+0x60/0x6e
kernel: [<ffffffff8001558e>] sync_buffer+0x0/0x3f
kernel: [<ffffffff88902d8f>] :gfs2:gfs2_dirent_find+0x0/0x4d
kernel: [<ffffffff80063aa4>] out_of_line_wait_on_bit+0x6c/0x78
kernel: [<ffffffff800a34d5>] wake_bit_function+0x0/0x23
kernel: [<ffffffff8001aaeb>] submit_bh+0x10d/0x114
kernel: [<ffffffff88920803>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
kernel: [<ffffffff889023ca>] :gfs2:get_leaf+0x6b/0xa8
kernel: [<ffffffff889029e6>] :gfs2:get_first_leaf+0x2a/0x31
kernel: [<ffffffff88902a70>] :gfs2:gfs2_dirent_search+0x83/0x16e
kernel: [<ffffffff8890403a>] :gfs2:gfs2_dir_search+0x21/0x73
kernel: [<ffffffff8000daa3>] permission+0x81/0xc8
kernel: [<ffffffff8890ac80>] :gfs2:gfs2_lookupi+0x12e/0x16b
kernel: [<ffffffff8890ac3e>] :gfs2:gfs2_lookupi+0xec/0x16b
kernel: [<ffffffff88917a08>] :gfs2:gfs2_lookup+0x26/0xa7
kernel: [<ffffffff889089db>] :gfs2:gfs2_glock_put+0xfd/0x115
kernel: [<ffffffff80022970>] d_alloc+0x176/0x1ab
kernel: [<ffffffff8000d09c>] do_lookup+0x126/0x227
kernel: [<ffffffff8000a295>] __link_path_walk+0x9e6/0xf25
kernel: [<ffffffff8000eb23>] link_path_walk+0x45/0xb8
kernel: [<ffffffff8000cdf6>] do_path_lookup+0x294/0x310
kernel: [<ffffffff8002380c>] __path_lookup_intent_open+0x56/0x97
kernel: [<ffffffff8001b0e7>] open_namei+0x73/0x6ba
kernel: [<ffffffff800275c8>] do_filp_open+0x1c/0x38
kernel: [<ffffffff80019f9a>] do_sys_open+0x44/0xbe
kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
- The filesystem is marked as withdrawn:
# for locktable in $(ls /sys/fs/gfs2/); do echo -n "Checking $locktable: "; if [ $(cat /sys/fs/gfs2/$locktable/withdraw) -eq 1 ]; then echo "Withdrawn"; else echo "OK"; fi; done
Checking rhcluster:app01: OK
Checking rhcluster:app02: Withdrawn
Checking rhcluster:app03: OK
- The issue recurs on another filesystem, or on the same filesystem again, after fsck.gfs2 has run and fixed any corruption present.
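The log review described in the symptoms above can be partly automated. The following is a minimal sketch rather than a supported diagnostic: the function name `find_get_leaf_withdraws` is hypothetical, and it assumes the withdraw message has the same layout as the example in this article. It matches on the `get_leaf` function and `fs/gfs2/dir.c` file, but not on the line number, on the assumption that the line number can differ between kernel builds.

```shell
#!/bin/bash
# Hypothetical helper: print the fsid of every GFS2 filesystem that logged
# the get_leaf withdraw signature in the given log files.
find_get_leaf_withdraws() {
    # "$@": one or more log files to scan
    grep -h 'GFS2: fsid=.*function = get_leaf, file = fs/gfs2/dir\.c' "$@" |
        sed 's/.*GFS2: fsid=\([^ ]*\): function.*/\1/' |   # keep only the fsid
        sort -u                                             # one line per fsid
}
```

For example, `find_get_leaf_withdraws /var/log/messages*` run as root on each node would list the affected lock tables, such as `rhcluster:app02.2` for the log excerpt shown above.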