RHEL7.3: PIDs blocked due to fscache hangs and kworker messages "CacheFiles: Error: Overlong wait for old active object to go away"

Solution In Progress - Updated -

Issue

  • The following CacheFiles messages appear repeatedly in the messages file:
[9165279.370604] CacheFiles: Error: Overlong wait for old active object to go away
[9165279.371138] CacheFiles: object: OBJc8052
[9165279.371753] CacheFiles: objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
[9165279.372330] CacheFiles: ops=0 inp=0 exc=0
[9165279.372939] CacheFiles: parent=ffff881ffe500480
[9165279.373567] CacheFiles: cookie=ffff881efb3f6210 [pr=ffff881fddb18000 nd=ffff883ffab06800 fl=22]
[9165279.374149] CacheFiles: key=[12] '030002000000000081498967'
[9165279.374782] CacheFiles: xobject: OBJbeeb6
[9165279.375397] CacheFiles: xobjstate=WAIT_FOR_CLEARANCE fl=30 wbusy=0 ev=0[10]
[9165279.375947] CacheFiles: xops=0 inp=0 exc=0
[9165279.376525] CacheFiles: xparent=ffff881ffe500480
[9165279.377133] CacheFiles: xcookie=ffff883d403c6d68 [pr=ffff881fddb18000 nd=          (null) fl=18]
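When these reports repeat many times, it can help to summarize them instead of reading each dump. The following is a minimal sketch, not a Red Hat tool: it assumes the log format is exactly as in the excerpt above, and the names parse_reports, ERROR_MARKER, STATE_RE and KEY_RE are hypothetical helpers introduced only for illustration. It collects the state of the new object (objstate), the state of the old object being waited on (xobjstate), and the cookie key for each report.

```python
import re

# Text that starts each report in the excerpt above.
ERROR_MARKER = "CacheFiles: Error: Overlong wait for old active object to go away"
# Captures "objstate=..." (new object) and "xobjstate=..." (old object).
STATE_RE = re.compile(r"CacheFiles: (x?objstate)=(\S+)")
# Captures the hex key printed as: key=[12] '0300...'
KEY_RE = re.compile(r"CacheFiles: key=\[\d+\] '([0-9a-fA-F]+)'")


def parse_reports(lines):
    """Return one dict per 'Overlong wait' report found in `lines`.

    Each dict may contain the keys 'objstate', 'xobjstate' and 'key',
    depending on which detail lines followed the error marker.
    """
    reports = []
    current = None
    for line in lines:
        if ERROR_MARKER in line:
            # A new report begins; later detail lines belong to it.
            current = {}
            reports.append(current)
            continue
        if current is None:
            continue  # detail line seen before any error marker
        match = STATE_RE.search(line)
        if match:
            current[match.group(1)] = match.group(2)
            continue
        match = KEY_RE.search(line)
        if match:
            current["key"] = match.group(1)
    return reports
```

Passing an open handle on the messages file (for example /var/log/messages, read as root) to parse_reports yields a list that can be counted or grouped by key to see whether the same cached object is involved each time.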

  • HPC grid jobs are blocked from finishing, which prevents new jobs from being scheduled. One task that has been blocked for a very long time is stuck in __fscache_wait_on_invalidate:

# cat /proc/31723/stack
[<ffffffffa07f7e8e>] __fscache_wait_on_invalidate+0x2e/0x30 [fscache]
[<ffffffffa0881d81>] nfs_invalidate_mapping+0x61/0x100 [nfs]
[<ffffffffa088249a>] __nfs_revalidate_mapping+0xfa/0x280 [nfs]
[<ffffffffa0882a73>] nfs_revalidate_mapping_protected+0x13/0x20 [nfs]
[<ffffffffa087efa4>] nfs_file_read+0x44/0xf0 [nfs]
[<ffffffff811fe0bd>] do_sync_read+0x8d/0xd0
[<ffffffff811fe86e>] vfs_read+0x9e/0x170
[<ffffffff811ff43f>] SyS_read+0x7f/0xe0
[<ffffffff81697709>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
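
To find every task in the same state rather than checking one PID at a time, the stack files can be scanned in bulk. The sketch below is illustrative only: tasks_blocked_in is a hypothetical helper name, it must run as root to read /proc/<pid>/stack, and it inspects only the main thread of each process (per-thread stacks under /proc/<pid>/task/ are not checked).

```python
import os


def tasks_blocked_in(symbol, proc_root="/proc"):
    """Return the sorted PIDs whose kernel stack mentions `symbol`.

    `proc_root` is a parameter only so the function can be pointed at a
    test directory; on a live system the default is the one to use.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        stack_path = os.path.join(proc_root, entry, "stack")
        try:
            with open(stack_path) as stack_file:
                stack = stack_file.read()
        except OSError:
            # The process exited meanwhile, or we lack permission.
            continue
        if symbol in stack:
            pids.append(int(entry))
    return sorted(pids)
```

For the case described here, calling tasks_blocked_in("__fscache_wait_on_invalidate") lists the processes waiting in the same function as PID 31723 above.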

Environment

  • Red Hat Enterprise Linux 7.3 (NFS client)
    • kernel-3.10.0-514.26.2.el7
    • cachefilesd-0.10.9-1.el7
  • NFSv3 with fscache enabled
  • autofs used to mount the NFSv3 share with 'fsc'
  • /etc/cachefilesd.conf contains default settings
  • An XFS filesystem on a linear LVM volume is used for /var/cache/fscache, mounted with some non-default options (noatime,nodiratime):
/dev/mapper/vg-cachevol /var/cache/fscache xfs rw,seclabel,noatime,nodiratime,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota 0 0
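
To confirm which NFS mounts on a client actually use fscache, the mount table can be filtered for the 'fsc' option. This is a rough sketch under two assumptions: that the NFS client reports 'fsc' among the options shown in /proc/mounts, and that entries follow the same field layout as the line above. The function name nfs_mounts_with_fsc is hypothetical.

```python
def nfs_mounts_with_fsc(mounts_text):
    """Return mount points of NFS filesystems mounted with the 'fsc' option.

    `mounts_text` is the content of a mount table in /proc/mounts format:
    device, mount point, filesystem type, comma-separated options, ...
    """
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        _device, mount_point, fstype, options = fields[:4]
        # Match the option exactly so that e.g. "nofsc" is not counted.
        if fstype in ("nfs", "nfs4") and "fsc" in options.split(","):
            mount_points.append(mount_point)
    return mount_points
```

On a live system the input would typically be the result of reading /proc/mounts. With autofs, shares appear only while they are mounted, so an idle map entry will not be listed.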
