Capsule sync triggers XFS corruption: metadata I/O error: block 0x ("xfs_trans_read_buf_map") error 117

Solution Verified

Environment

  • Red Hat Enterprise Linux (RHEL) 7.6
  • kernel-3.10.0-957
  • Red Hat Satellite 6
  • /var/lib/pulp on XFS filesystem

Issue

  • When a sync action is triggered on Satellite 6.4 running on RHEL 7.6, metadata corruption is sometimes reported and the XFS filesystem containing /var/lib/pulp is shut down.
  • The problem can occur on the Satellite server or on an external Capsule.

Resolution

If you're facing this issue, please get in touch with Red Hat Support and refer to bugzilla BZ#1658749.

Until the issue is resolved, a possible workaround is to downgrade the kernel to a version earlier than the RHEL 7.6 kernel (kernel-3.10.0-957).
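
To decide whether the workaround applies to a host, the running kernel release can be compared against the version named in this article. The following is a minimal sketch, not an official check: the helper name is_affected_kernel is hypothetical, and it assumes that every build whose release string starts with 3.10.0-957 is affected, which this article does not confirm for later errata builds.

```python
import platform

# Assumption: only the base version named in this article is matched;
# whether later 3.10.0-957.x errata builds are affected is not stated here.
AFFECTED_KERNEL_PREFIX = "3.10.0-957"


def is_affected_kernel(release: str) -> bool:
    """Return True if a `uname -r` style string matches the affected build."""
    # Require a "." after the prefix so that e.g. "3.10.0-9570" does not match.
    return (
        release == AFFECTED_KERNEL_PREFIX
        or release.startswith(AFFECTED_KERNEL_PREFIX + ".")
    )


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    state = "matches" if is_affected_kernel(running) else "does not match"
    print(f"Running kernel {running} {state} kernel-{AFFECTED_KERNEL_PREFIX}")
```

A release string such as 3.10.0-957.el7.x86_64 matches, while 3.10.0-862.el7.x86_64 does not.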

Diagnostic Steps

After running a full sync of a capsule, the task may fail and the following kernel messages may appear in dmesg or the system logs.

XFS (dm-8): Metadata corruption detected at xfs_dir3_data_read_verify+0x5e/0x110 [xfs], xfs_dir3_data block 0x5e070348 
XFS (dm-8): Unmount and run xfs_repair 
XFS (dm-8): First 64 bytes of corrupted metadata buffer: 
ffff91559a395000: 2f 76 61 72 2f 6c 69 62 2f 70 75 6c 70 2f 63 6f  /var/lib/pulp/co
ffff91559a395010: 6e 74 65 6e 74 2f 75 6e 69 74 73 2f 72 70 6d 2f  ntent/units/rpm/
ffff91559a395020: 35 65 2f 64 33 66 35 64 39 66 64 61 61 30 61 65  5e/d3f5d9fdaa0ae
ffff91559a395030: 66 65 30 63 33 63 64 32 37 66 65 32 66 33 35 37  fe0c3cd27fe2f357
XFS (dm-8): metadata I/O error: block 0x5e070348 ("xfs_trans_read_buf_map") error 117 numblks 8 
XFS (dm-8): xfs_do_force_shutdown(0x1) called from line 370 of file fs/xfs/xfs_trans_buf.c. Return address = 0xffffffffc049d28a 
XFS (dm-8): I/O Error Detected. Shutting down filesystem 
XFS (dm-8): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5. 
XFS (dm-8): Please umount the filesystem and rectify the problem(s)
XFS (dm-2): Metadata corruption detected at xfs_inode_buf_verify+0x79/0x100 [xfs], xfs_inode block 0x7129cb70
XFS (dm-2): Unmount and run xfs_repair
XFS (dm-2): First 64 bytes of corrupted metadata buffer:
ffff99241abc8000: a4 b6 e0 a4 a8 e0 a5 85 e0 a4 b2 e0 a4 bf e0 a4  ................
ffff99241abc8010: 9f e0 a5 80 e0 a4 95 e0 a4 b0 e0 a5 80 e0 a4 a4  ................
ffff99241abc8020: e0 a4 be 20 50 65 72 6c 20 e0 a4 b8 e0 a4 82 e0  ... Perl .......
ffff99241abc8030: a4 b5 e0 a4 be e0 a4 a6 2e 3c 2f 64 65 73 63 72  .........</descr
XFS (dm-2): metadata I/O error: block 0x7129cb70 ("xfs_trans_read_buf_map") error 117 numblks 16
XFS (dm-2): xfs_do_force_shutdown(0x1) called from line 370 of file fs/xfs/xfs_trans_buf.c.  Return address = 0xffffffffc05ec28a
XFS (dm-2): I/O Error Detected. Shutting down filesystem
XFS (dm-2): Please umount the filesystem and rectify the problem(s)
XFS (dm-2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
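
When searching a large log for these events, the "Metadata corruption detected" lines can be extracted with a regular expression. The sketch below assumes the message layout is exactly as in the excerpts above; the function name find_corruption_reports and the dictionary keys are illustrative and do not come from any Red Hat tool.

```python
import re
from typing import Dict, List

# Pattern derived from the two example messages above; other kernel
# versions may format the message differently.
CORRUPTION_RE = re.compile(
    r"XFS \((?P<device>[^)]+)\): Metadata corruption detected at "
    r"(?P<verifier>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[xfs\], "
    r"(?P<buffer_type>\w+) block (?P<block>0x[0-9a-f]+)"
)


def find_corruption_reports(log_text: str) -> List[Dict[str, str]]:
    """Return one dict per corruption message found in the log text."""
    return [match.groupdict() for match in CORRUPTION_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "XFS (dm-8): Metadata corruption detected at "
        "xfs_dir3_data_read_verify+0x5e/0x110 [xfs], "
        "xfs_dir3_data block 0x5e070348\n"
        "XFS (dm-8): Unmount and run xfs_repair\n"
    )
    for report in find_corruption_reports(sample):
        print(report["device"], report["verifier"], report["block"])
```

The extracted verifier name identifies which kind of metadata buffer failed verification, and the device and block number locate it.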

Valid XFS self-describing metadata buffers begin with a magic number, which the corrupt buffers shown above lack. For example, a buffer checked by xfs_inode_buf_verify should begin with IN:

ffff8cd65edb5000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00  IN..............

The verifiers seen with this corruption and their expected magic numbers are:

xfs_inode_buf_verify          'IN'
xfs_dir3_data_read_verify     'XDD3'
xfs_dir3_free_read_verify     'XDF3'
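
The comparison can be scripted by decoding the first line of a dumped buffer and checking it against the list above. This is a rough sketch: the names EXPECTED_MAGIC, parse_hexdump_line and has_expected_magic are hypothetical, and the parser assumes the 16-bytes-per-line dump layout shown in this article.

```python
import re

# Verifier names and magic numbers taken from the list above.
EXPECTED_MAGIC = {
    "xfs_inode_buf_verify": b"IN",
    "xfs_dir3_data_read_verify": b"XDD3",
    "xfs_dir3_free_read_verify": b"XDF3",
}

HEX_BYTE = re.compile(r"[0-9a-fA-F]{2}")


def parse_hexdump_line(line: str) -> bytes:
    """Decode the hex byte columns of one 'address: xx xx ...  ascii' line."""
    _, _, rest = line.partition(":")
    data = bytearray()
    # At most 16 byte columns follow the address; the ASCII column is ignored.
    for token in rest.split()[:16]:
        if not HEX_BYTE.fullmatch(token):
            break
        data.append(int(token, 16))
    return bytes(data)


def has_expected_magic(verifier: str, first_dump_line: str) -> bool:
    """Return True if the buffer starts with the magic the verifier expects."""
    # Raises KeyError for verifiers that are not in the list above.
    magic = EXPECTED_MAGIC[verifier]
    return parse_hexdump_line(first_dump_line).startswith(magic)


if __name__ == "__main__":
    good = "ffff8cd65edb5000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00  IN.............."
    bad = "ffff99241abc8000: a4 b6 e0 a4 a8 e0 a5 85 e0 a4 b2 e0 a4 bf e0 a4  ................"
    print(has_expected_magic("xfs_inode_buf_verify", good))
    print(has_expected_magic("xfs_inode_buf_verify", bad))
```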

In some cases the buffer appears to contain a path, possibly a symlink target, as in the dm-8 example above.
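
To see what a corrupt buffer actually holds, the dumped bytes can be joined and decoded as text. The sketch below is illustrative only: buffer_as_text and looks_like_path are hypothetical helpers, the regular expression assumes the 16-digit address and 16-byte line layout seen in this article, and the "looks like a path" test is a simple heuristic (starts with "/" and is fully printable).

```python
import re
from typing import Iterable

# One dump line: a 16-hex-digit address, a colon, then up to 16 " xx" bytes.
DUMP_LINE = re.compile(r"\b[0-9a-f]{16}:((?: [0-9a-f]{2}){1,16})")


def buffer_as_text(dump_lines: Iterable[str]) -> str:
    """Join the bytes of consecutive dump lines and decode them as UTF-8."""
    data = bytearray()
    for line in dump_lines:
        match = DUMP_LINE.search(line)
        if match:
            # bytes.fromhex() skips the spaces between byte pairs.
            data += bytes.fromhex(match.group(1))
    # Undecodable bytes become U+FFFD instead of raising an error.
    return data.decode("utf-8", errors="replace")


def looks_like_path(text: str) -> bool:
    """Heuristic: absolute-path-like, printable text."""
    return text.startswith("/") and text.isprintable()


if __name__ == "__main__":
    dump = [
        "ffff91559a395000: 2f 76 61 72 2f 6c 69 62 2f 70 75 6c 70 2f 63 6f  /var/lib/pulp/co",
        "ffff91559a395010: 6e 74 65 6e 74 2f 75 6e 69 74 73 2f 72 70 6d 2f  ntent/units/rpm/",
    ]
    text = buffer_as_text(dump)
    print(text, looks_like_path(text))
```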

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
