ext4 file system corrupted with ext4_mb_generate_buddy messages seen in the logs
Environment
- Red Hat Enterprise Linux 5
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 8
- ext4 filesystem
Issue
- ext4 file system corrupted with ext4_mb_generate_buddy messages like the following seen in /var/log/messages:
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84158: 7984 blocks in bitmap, 1840 in gd
Resolution
At present, there is no verified resolution for this issue. Some possible causes and resolutions for this issue are:
- Problematic SAN firmware. An upgrade to the firmware may fix the issue.
- Problems caused by storage hardware. Hardware diagnostics are recommended.
- Memory corruption. Hardware diagnostics are recommended.
- On Red Hat Enterprise Linux 6, performing an offline resize after consuming all reserved GDT blocks. Check the following solution for more information and steps to correct the issue: Consumption of reserved GDT blocks during an online resize results in corruption following the offline resize to an ext4 filesystem
Troubleshooting
Whilst the meaning of the ext4_mb_generate_buddy message is understood, the root cause of the block count mismatch is not. Often, it has been caused by an issue in the storage hardware.
When troubleshooting this issue, it would be useful to collect the following data before repairing the filesystem:
- The whole dmesg output (not just parts)
- The whole /var/log/messages file (not just parts)
- Diagnostic image of the file system in question, using a command like:
# e2image -r /dev/<device> - | bzip2 > <device>.e2i.bz2
- Workload description (percentage of read/write, sequential/random, directories and files per directory, ways files are interacted with)
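The collection steps above can be sketched as a small shell helper. This is only an illustration, not a supported tool: the function names collect_ext4_diag and run_cmd, the DRY_RUN variable, and the output file names are assumptions made for this example. The output directory should sit on a different filesystem from the one being examined, and the commands need root privileges.

```shell
#!/bin/sh
# Sketch only: collect diagnostics BEFORE running any repair.
# With DRY_RUN=1 the commands are printed instead of executed.

run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$1"
    else
        sh -c "$1"
    fi
}

# Usage: collect_ext4_diag /dev/<device> <output-directory>
# The body runs in a subshell so its variables do not leak to the caller.
collect_ext4_diag() (
    device="$1"
    outdir="$2"
    name=$(basename "$device")

    run_cmd "mkdir -p '$outdir'"
    # Whole kernel ring buffer, not an excerpt
    run_cmd "dmesg > '$outdir/dmesg.txt'"
    # Whole system log, not an excerpt
    run_cmd "cp /var/log/messages '$outdir/messages'"
    # Raw metadata image of the filesystem, compressed
    run_cmd "e2image -r '$device' - | bzip2 > '$outdir/$name.e2i.bz2'"
)
```

Running it first with DRY_RUN=1 shows the exact commands that would be executed, which makes it easy to review them before touching the affected device.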
The best piece of troubleshooting information would be a set of steps which consistently reproduce this issue. At present, a consistent reproducer remains unknown.
Please contact Red Hat Global Support Services to raise a support case if you are encountering this issue.
Filesystem Repair
Once the above data has been collected, repair the filesystem with fsck to clean any corruption present:
# e2fsck -fp /dev/<device>
Note: Red Hat recommends having a backup of data present in the filesystem before running the filesystem check.
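After the check finishes, its exit status indicates whether the corruption was actually repaired. The status is a bitmask documented in the e2fsck(8) man page; the helper below is a minimal sketch for decoding it, and the function name describe_e2fsck_status is an assumption made for this example.

```shell
#!/bin/sh
# Sketch only: decode an e2fsck exit status (a bitmask, per e2fsck(8)).
# Usage: e2fsck -fp /dev/<device>; describe_e2fsck_status $?
describe_e2fsck_status() {
    if [ "$1" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( $1 & 1 )) -ne 0 ]   && echo "file system errors corrected"
    [ $(( $1 & 2 )) -ne 0 ]   && echo "file system errors corrected, system should be rebooted"
    [ $(( $1 & 4 )) -ne 0 ]   && echo "file system errors left uncorrected"
    [ $(( $1 & 8 )) -ne 0 ]   && echo "operational error"
    [ $(( $1 & 16 )) -ne 0 ]  && echo "usage or syntax error"
    [ $(( $1 & 32 )) -ne 0 ]  && echo "e2fsck canceled by user request"
    [ $(( $1 & 128 )) -ne 0 ] && echo "shared library error"
    return 0
}
```

A status with bit 4 set means the preen mode (-p) found a problem it would not fix automatically, so the filesystem likely needs a further, interactive e2fsck run.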
VMware virtual machines
There is also a known issue where this corruption can occur on VMware hypervisors using LSI MegaRAID SAS devices to store virtual machine disks. Please consult your virtualization vendor if you face this issue with a similar configuration.
Root Cause
These messages are due to ext4 detecting a mismatch between the free block count calculated from the ext4 buddy allocator bitmaps and the free block count stored in the group descriptor. This may be the result of corruption of the bitmaps. When a mismatch is found, the code updates the group descriptor free block count with the value calculated from the bitmaps so that the two match going forward. The exact cause of this issue is still under investigation in a private Bugzilla, but in remote storage situations the cause appears to be hardware or firmware related in the majority of cases.
Red Hat Enterprise Linux 5
- Bugzilla RHBZ#789497 was opened to address this issue in RHEL 5:
- This BZ is CLOSED INSUFFICIENT_DATA.
- The ext4 buddy messages are not bugs themselves. They are messages arising from corruption with a source outside of the operating system.
- There are no known ext4 bugs that explain the behavior, and without a reproducer no additional troubleshooting can be done.
Red Hat Enterprise Linux 6
- Bugzilla RHBZ#516580 was opened to address this issue in RHEL 6:
- This BZ is CLOSED INSUFFICIENT_DATA.
- The ext4 buddy messages are not bugs themselves. They are messages arising from corruption with a source outside of the operating system.
- There are no known ext4 bugs that explain the behavior, and without a reproducer no additional troubleshooting can be done.
Diagnostic Steps
- Check /var/log/messages for messages similar to the following:
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84158: 7984 blocks in bitmap, 1840 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84157: 2048 blocks in bitmap, 0 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84156: 18432 blocks in bitmap, 0 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84155: 8192 blocks in bitmap, 0 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84154: 26624 blocks in bitmap, 0 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84153: 9677 blocks in bitmap, 1485 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84152: 8192 blocks in bitmap, 0 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84134: 24323 blocks in bitmap, 22275 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84123: 28672 blocks in bitmap, 2047 in gd
kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy: EXT4-fs: group 84122: 31541 blocks in bitmap, 17205 in gd
kernel: EXT4-fs error (device dm-1): ext4_free_inode: bit already cleared for inode 692357486
kernel: EXT4-fs error (device dm-1): mb_free_blocks: double-free of inode 0's block 2756556800(bit 14336 in group 84123)
kernel: EXT4-fs error (device dm-1): mb_free_blocks: double-free of inode 0's block 2756556801(bit 14337 in group 84123)
kernel: EXT4-fs error (device dm-1): mb_free_blocks: double-free of inode 0's block 2756556802(bit 14338 in group 84123)
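When many block groups are affected, it can help to condense the ext4_mb_generate_buddy lines into one row per group showing both counts and their difference. The following is a rough sketch that assumes the message format shown above; the function name summarize_buddy_errors is made up for this example, and other kernel versions may word the message differently.

```shell
#!/bin/sh
# Sketch only: summarise ext4_mb_generate_buddy errors from a log file.
# Prints: device, block group, free blocks per bitmap, free blocks per
# group descriptor (gd), and the difference between the two counts.
# Usage: summarize_buddy_errors /var/log/messages
summarize_buddy_errors() {
    grep 'ext4_mb_generate_buddy' "$1" |
    sed -n 's/.*(device \([^)]*\)).*group \([0-9]*\): \([0-9]*\) blocks in bitmap, \([0-9]*\) in gd.*/\1 \2 \3 \4/p' |
    awk '{ printf "%s group=%s bitmap=%s gd=%s diff=%d\n", $1, $2, $3, $4, $3 - $4 }'
}
```

For the first sample message above, this prints `dm-1 group=84158 bitmap=7984 gd=1840 diff=6144`. Other EXT4-fs errors, such as the ext4_free_inode and mb_free_blocks lines, are left out of the summary.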
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.