Removing a file (rm) blocking indefinitely on GPFS on RHEL 6

Solution In Progress - Updated -

Environment

  • Red Hat Enterprise Linux 6.7
  • kernel 2.6.32-573.7.1.el6.x86_64
  • GPFS (mmfslinux, mmfs26)

Issue

  • After upgrading the kernel and glibc, rm hangs indefinitely when removing files on GPFS on many of our servers. We can't confirm that the upgrade is the cause. The kernel logs the following hung-task warning:
INFO: task rm:19688 blocked for more than 120 seconds.
      Not tainted 2.6.32-573.7.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rm            D ffff88107fc25400     0 19688  16836 0x10000080
 ffff881a8038b9a8 0000000000000086 0000000000000000 0000000000000086
 ffff881a8038b918 ffff8810698c5cc0 00003eef36838680 ffff882066704040
 ffff8800282159c0 00000001041bfc4a ffff881b68979068 ffff881a8038bfd8
Call Trace:
 [<ffffffffa0720a93>] cxiWaitEventWait+0x143/0x230 [mmfslinux]
 [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
 [<ffffffff8105e173>] ? __wake_up+0x53/0x70
 [<ffffffffa0844150>] _ZN6ThCond12internalWaitEP16KernelSynchStatejPv+0xd0/0x220 [mmfs26]
 [<ffffffffa084545e>] ? _ZN6ThCond5kWaitEiPKc+0x13e/0x2d0 [mmfs26]
 [<ffffffffa078a85b>] ? _ZN13KernelMailbox21sendToDaemonWithReplyEv+0x29b/0x340 [mmfs26]
 [<ffffffffa079908a>] ? _ZN11cifsProcess12isRegisteredEv+0x1a/0x90 [mmfs26]
 [<ffffffffa07c0591>] ? _Z10kSFSRemoveP15KernelOperation7FileUIDPcjiS1_P10ext_cred_tjj+0x2d1/0x440 [mmfs26]
 [<ffffffffa07895bf>] ? _Z10RemoveFileP15KernelOperationP13gpfsVfsData_tP10gpfsNode_tS4_PcjiP10ext_cred_tjP9MMFSVInfo+0x12f/0x1e0 [mmfs26]
 [<ffffffffa07edb38>] ? _Z10gpfsRemoveP13gpfsVfsData_tP9cxiNode_tS2_PcjP9MMFSVInfoP10ext_cred_t+0x388/0x570 [mmfs26]
 [<ffffffffa0720730>] ? cxiBlockingMutexRelease+0x70/0x80 [mmfslinux]
 [<ffffffffa07f0e9f>] ? _Z33gpfsIsCifsBypassTraversalCheckingv+0xef/0x110 [mmfs26]
 [<ffffffffa0894225>] ? _Z21GPFSToSystemErrnoFull5Errno+0x15/0x60 [mmfs26]
 [<ffffffffa0740349>] ? gpfs_i_unlink+0x329/0x460 [mmfslinux]
 [<ffffffff810a75bf>] ? up+0x2f/0x50
 [<ffffffff8119ef83>] ? generic_permission+0x23/0xb0
 [<ffffffffa072a15c>] ? gpfs_i_permission_noacl+0x4c/0xe0 [mmfslinux]
 [<ffffffff812327ff>] ? security_inode_permission+0x1f/0x30
 [<ffffffff8119f8f7>] ? inode_permission+0xa7/0x100
 [<ffffffff811a0770>] ? vfs_unlink+0xa0/0xf0
 [<ffffffff8119f98a>] ? lookup_hash+0x3a/0x50
 [<ffffffff811a3863>] ? do_unlinkat+0x163/0x260
 [<ffffffff810e8a57>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff810197c8>] ? syscall_trace_enter+0x1d8/0x1e0
 [<ffffffff811a3b92>] ? sys_unlinkat+0x22/0x40
 [<ffffffff8100b2e8>] ? tracesys+0xd9/0xde
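
A trace like the one above is easier to read once it is split into symbol, owning module, and whether the frame is marked with "?" (a frame the kernel found on the stack but could not verify). The following is a minimal Python sketch for that, not a Red Hat or IBM tool. The names Frame, parse_frames, and frames_per_module are hypothetical, and the regular expression assumes the RHEL 6 trace layout shown in this article.

```python
import re
from collections import Counter
from typing import List, NamedTuple


class Frame(NamedTuple):
    symbol: str     # function name, still mangled for C++ symbols
    module: str     # "kernel" when the line has no [module] suffix
    reliable: bool  # False when the frame is prefixed with "?"


# Assumed layout: " [<address>] [? ]symbol+0xoff/0xsize [module]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<q>\?\s+)?(?P<sym>\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frames(trace: str) -> List[Frame]:
    """Return one Frame per call-trace line found in the text."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                Frame(
                    symbol=m.group("sym"),
                    module=m.group("mod") or "kernel",
                    reliable=m.group("q") is None,
                )
            )
    return frames


def frames_per_module(frames: List[Frame]) -> Counter:
    """Count how many frames belong to each module."""
    return Counter(f.module for f in frames)
```

Applied to the trace in this article, the frames without a "?" prefix are cxiWaitEventWait in mmfslinux and the ThCond internalWait symbol in mmfs26, which suggests the task is waiting inside the GPFS kernel modules rather than in the generic VFS layer.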

Resolution

  • TBD

Diagnostic Steps

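  1. Check syslog for hung-task warnings ("INFO: task ... blocked for more than 120 seconds").
  2. Collect a vmcore while the task is hung.
  3. Open the vmcore in the crash utility and identify the process that has been in uninterruptible (UN) state the longest: ps -m | grep UN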
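
The syslog check in step 1 can be scripted. The following is a minimal Python sketch, assuming the warning text has the same form as the message shown in the Issue section. The names HungTask and find_hung_tasks are hypothetical, and the log location is site-specific.

```python
import re
from typing import Iterable, List, NamedTuple

# Matches the kernel hung-task warning, for example:
# "INFO: task rm:19688 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


class HungTask(NamedTuple):
    comm: str     # command name as reported by the kernel
    pid: int
    seconds: int  # threshold reported in the warning, not the total hang time


def find_hung_tasks(lines: Iterable[str]) -> List[HungTask]:
    """Return one HungTask per hung-task warning found in the given lines."""
    hits = []
    for line in lines:
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append(
                HungTask(m.group("comm"), int(m.group("pid")), int(m.group("secs")))
            )
    return hits
```

To use it, pass an open log file, for example find_hung_tasks(open("/var/log/messages", errors="replace")), where the path is an assumption about where syslog writes kernel messages on the affected host. A PID that appears repeatedly is a good candidate to look up in the vmcore in step 3.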

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
