RHEL6: kswapd panic "exception RIP: clear_inode+248" kernel BUG at fs/inode.c:313
Environment
- Red Hat Enterprise Linux 6
- kernel prior to 2.6.32-220.13.1.el6
- Custom/proprietary fuse filesystem
Issue
- The kernel panics under heavy load through a FUSE file system, with occasional ssh operations.
- A core file shows a backtrace similar to the following:
Pid: 258, comm: kswapd0 Not tainted 2.6.32-220.7.1.el6.x86_64 #1 HP ProLiant DL380 G7
RIP: 0010:[<ffffffff81190f68>] [<ffffffff81190f68>] clear_inode+0xf8/0x110
RSP: 0018:ffff88060e3c1af0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88069a7cc9c0 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88069a7cc9c0
RBP: ffff88060e3c1b00 R08: 0000000000000000 R09: 000000000000000e
R10: ffff8806e283e2a0 R11: 0000000000000003 R12: ffffffff81fbf380
R13: 0000000000000000 R14: ffff880c0c8274f8 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000052885a CR3: 0000000664a6d000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 258, threadinfo ffff88060e3c0000, task ffff88060e38f540)
Stack:
ffff88060e3c1b00 ffff88069a7cc9c0 ffff88060e3c1b30 ffffffff811916a6
<0> ffff8806fca78080 ffff88069a7cc9c0 ffff88069a7cc9c0 ffff88060e3c1be0
<0> ffff88060e3c1b50 ffffffff811905c2 ffff88060e3c1b50 ffff8806d2a26b00
Call Trace:
[<ffffffff811916a6>] generic_delete_inode+0x196/0x1d0
[<ffffffff811905c2>] iput+0x62/0x70
[<ffffffff8118d120>] dentry_iput+0x90/0x100
[<ffffffff8118d281>] d_kill+0x31/0x60
[<ffffffff8118d616>] __shrink_dcache_sb+0x366/0x3c0
[<ffffffff8118d799>] shrink_dcache_memory+0x129/0x1e0
[<ffffffff8112995a>] shrink_slab+0x12a/0x1a0
[<ffffffff8112c70d>] balance_pgdat+0x57d/0x7e0
[<ffffffff8112cd20>] ? isolate_pages_global+0x0/0x350
[<ffffffff81133596>] ? set_pgdat_percpu_threshold+0xa6/0xd0
[<ffffffff8112caa6>] kswapd+0x136/0x3b0
[<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8112c970>] ? kswapd+0x0/0x3b0
[<ffffffff81090726>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffff81090690>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: e8 ce 89 fe ff eb d5 0f 1f 40 00 48 83 bb d0 01 00 00 00 74 c7 48 89 df e8 66 e2 01 00 0f b7 83 ae 00 00 00 25 00 f0 00 00 eb aa <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 66 66 66 2e 0f 1f 84 00 00
RIP [<ffffffff81190f68>] clear_inode+0xf8/0x110
Resolution
- Fixed in 6.2.z Red Hat Errata RHSA-2012-0481 kernel 2.6.32-220.13.1.el6
- Fixed in 6.3 Red Hat Errata RHSA-2012-0862 kernel 2.6.32-279.el6
- NOTE: A very similar crash in clear_inode was reported on later RHEL6 kernels and is discussed in https://access.redhat.com/solutions/435633.
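To decide whether a given kernel predates the fix, the release string can be compared against the first fixed build named above. The following is a minimal sketch, not an official tool: `release_key` and `is_affected` are hypothetical helper names, and the comparison is a simplified numeric ordering of el6 release strings rather than a full RPM version comparison.

```python
import re

# First fixed 6.2.z kernel, taken from the Resolution section above.
FIXED_BUILD = "2.6.32-220.13.1.el6"


def release_key(release):
    """Turn '2.6.32-220.7.1.el6.x86_64' into a tuple of ints for comparison."""
    # Keep only the numeric version-release part that precedes '.el6'.
    numeric = release.split(".el", 1)[0]
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def is_affected(release):
    """True if an el6 release string sorts before the first fixed build."""
    return release_key(release) < release_key(FIXED_BUILD)


if __name__ == "__main__":
    for rel in ("2.6.32-220.7.1.el6.x86_64", "2.6.32-279.el6.x86_64"):
        print(rel, "affected" if is_affected(rel) else "contains the fix")
```

On a live system the release string would typically come from `uname -r`. Because 2.6.32-279.el6 sorts after 2.6.32-220.13.1.el6, a single threshold covers both errata listed above.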
Root Cause
- Known issue fixed by the following upstream commit:
commit 08142579b6ca35883c1ed066a2681de6f6917062
Author: Jan Kara <jack@suse.cz>
Date: Mon Jun 27 16:18:10 2011 -0700
mm: fix assertion mapping->nrpages == 0 in end_writeback()
Under heavy memory and filesystem load, users observe the assertion
mapping->nrpages == 0 in end_writeback() trigger. This can be caused by
page reclaim reclaiming the last page from a mapping in the following
race:
        CPU0                                    CPU1
  ...
  shrink_page_list()
    __remove_mapping()
      __delete_from_page_cache()
        radix_tree_delete()
                                        evict_inode()
                                          truncate_inode_pages()
                                            truncate_inode_pages_range()
                                              pagevec_lookup() - finds nothing
                                          end_writeback()
                                            mapping->nrpages != 0 -> BUG
        page->mapping = NULL
        mapping->nrpages--
Fix the problem by doing a reliable check of mapping->nrpages under
mapping->tree_lock in end_writeback().
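The race and the fix described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `Mapping` class, the two thread functions, and the `lookup_done` event are hypothetical stand-ins. It assumes that the reclaim side deletes the page from the tree and decrements `nrpages` while holding `tree_lock`, which is what would make a check taken under that lock reliable.

```python
import threading


class Mapping:
    """Toy stand-in for struct address_space (not kernel code)."""

    def __init__(self):
        self.tree_lock = threading.Lock()
        self.page_tree = {0: "page"}   # the last cached page
        self.nrpages = 1


def reclaim(mapping, removed, lookup_done):
    """CPU0: delete the page, then decrement nrpages, all under tree_lock."""
    with mapping.tree_lock:
        del mapping.page_tree[0]       # radix_tree_delete()
        removed.set()                  # the race window opens here
        lookup_done.wait()             # model-only: keep the window open for CPU1
        mapping.nrpages -= 1           # mapping->nrpages--


def evict(mapping, removed, lookup_done, locked_check):
    """CPU1: the lockless lookup finds nothing, then nrpages is read."""
    removed.wait()
    assert not mapping.page_tree       # pagevec_lookup() - finds nothing
    if not locked_check:
        seen = mapping.nrpages         # unlocked read inside the window
        lookup_done.set()
        return seen
    lookup_done.set()                  # let CPU0 finish its critical section
    with mapping.tree_lock:            # blocks until CPU0 drops tree_lock
        return mapping.nrpages


def run(locked_check):
    """Return the nrpages value that the eviction side observes."""
    mapping = Mapping()
    removed, lookup_done = threading.Event(), threading.Event()
    seen = []
    cpu0 = threading.Thread(target=reclaim, args=(mapping, removed, lookup_done))
    cpu1 = threading.Thread(
        target=lambda: seen.append(evict(mapping, removed, lookup_done, locked_check)))
    cpu0.start()
    cpu1.start()
    cpu0.join()
    cpu1.join()
    return seen[0]


if __name__ == "__main__":
    print("unlocked check sees nrpages =", run(False))
    print("locked check sees nrpages =", run(True))
```

In this model the unlocked read observes a stale non-zero count, which corresponds to the BUG in the diagram, while the read taken under `tree_lock` cannot run until the counter has been updated.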
Diagnostic Steps
- Check the oops message. If all of the following match, no further verification is necessary:
  - The kernel is one listed in the Environment section.
  - The Pid line shows a kswapd process and the RIP symbol is clear_inode.
  - The first few symbols of the Call Trace match (generic_delete_inode, iput, and dentry_iput).
Sep 30 08:26:38 node2 Pid: 178, comm: kswapd0 Not tainted 2.6.32-131.21.1.el6.x86_64 #1 ProLiant DL180 G6
Sep 30 08:26:38 node2 RIP: 0010:[<ffffffff8118cb88>]
Sep 30 08:26:38 node2 [<ffffffff8118cb88>] clear_inode+0xf8/0x110
...
Sep 30 08:26:38 Call Trace:
Sep 30 08:26:38 node2 [<ffffffff8118d2c6>] generic_delete_inode+0x196/0x1d0
Sep 30 08:26:38 node2 [<ffffffff8118c1d2>] iput+0x62/0x70
Sep 30 08:26:38 node2 [<ffffffff81188f70>] dentry_iput+0x90/0x100
- If any of the above checks fail, capture a core file and analyze it as follows. Note that RDI contains the inode pointer; check whether i_data.nrpages of that inode is 0:
#6 [ffff88062f45fa50] invalid_op at ffffffff81013f5b
[exception RIP: clear_inode+248]
RIP: ffffffff81186d18 RSP: ffff88062f45fb00 RFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff880128e020c0 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880128e020c0
RBP: ffff88062f45fb10 R8: ffff88062f45f960 R9: 000000000000000e
R10: 0000000000000000 R11: ffff88019fc0fd60 R12: ffffffff81bf5240
R13: 0000000000000000 R14: ffff88017d0fc0f8 R15: 0000000000000001
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88062f45fb18] generic_delete_inode at ffffffff81187426
crash> struct inode ffff880128e020c0
...
i_data = {
host = 0xffff880128e020c0,
page_tree = {
...
nrpages = 0, <-------------------------- HERE
- NOTE: If the 'nrpages' value above is non-zero and fairly large, see https://access.redhat.com/solutions/435633 for a very similar issue.
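The oops-message check at the start of this section could be scripted when many logs need triage. The sketch below is an assumption-laden illustration rather than a supported tool: `matches_signature` is a hypothetical name, the regular expressions are written against the log layouts shown in this article, and the kernel comparison is a simplified numeric ordering against 2.6.32-220.13.1.el6.

```python
import re

# 2.6.32-220.13.1.el6, the first fixed kernel from the Resolution section.
FIXED = (2, 6, 32, 220, 13, 1)
# Leading Call Trace symbols listed in the diagnostic criteria.
FRAMES = ["generic_delete_inode", "iput", "dentry_iput"]


def matches_signature(oops_text):
    """Return True if the oops text meets all three criteria from this section."""
    kernel = re.search(r"(\d+\.\d+\.\d+-[\d.]+?)\.el6", oops_text)
    comm = re.search(r"comm:\s*(\w+)", oops_text)
    # The RIP symbol may sit on the line after "RIP:", so let '.' span newlines.
    rip = re.search(r"RIP:.*?\]\s+(\w+)\+0x", oops_text, re.S)
    if not (kernel and comm and rip):
        return False
    version = tuple(int(n) for n in re.split(r"[.-]", kernel.group(1)))
    # Frames printed with a leading '?' are not captured by this pattern.
    trace = oops_text.partition("Call Trace:")[2]
    frames = re.findall(r"\]\s+(\w+)\+0x", trace)
    return (version < FIXED
            and comm.group(1).startswith("kswapd")
            and rip.group(1) == "clear_inode"
            and frames[:3] == FRAMES)
```

A `False` result only means the quick check did not match; in that case the core file analysis described above still applies.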