RHEL5: soft lockups in __invalidate_mapping_pages() on systems with high I/O load when dropping caches
Issue
After running a VFS I/O load test on a large-memory system (e.g. 768 GB) and then running
echo 1 > /proc/sys/vm/drop_caches
several "BUG: soft lockup" messages are seen in the system log:
kernel: BUG: soft lockup - CPU#11 stuck for 60s! [CCTEST-defaultt:2444]
egion_hash dm_log dm_mod dm_mem_cache usb_storage ahci libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
kernel: Pid: 2444, comm: CCTEST-defaultt Tainted: G ---- 2.6.18-274.7.1.el5 #1
kernel: RIP: 0010:[<ffffffff80064b63>] [<ffffffff80064b63>] _write_lock_irqsave+0x1/0x12
kernel: RSP: 0018:ffff81099f415cb8 EFLAGS: 00000286
kernel: RAX: 0000000000000065 RBX: ffff81607ee04ac0 RCX: ffff810ffa804f10
kernel: RDX: ffff8187331a9598 RSI: ffff81099f415d00 RDI: ffff81607ee04ad8
kernel: RBP: 000000000568f07e R08: ffff8187331a9550 R09: ffff8187331a9550
kernel: R10: ffff8187331a9550 R11: ffffffff88055ee8 R12: 0000000000000010
kernel: R13: 0000000000000002 R14: ffff81099f415c78 R15: ffff81607ee04ac8
kernel: FS: 00002aceaf6e3fe0(0000) GS:ffff810271d06440(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00000032c9e6baa0 CR3: 0000002aeead2000 CR4: 00000000000006a0
kernel:
kernel: Call Trace:
kernel: [<ffffffff80015551>] test_clear_page_dirty+0x7e/0x100
kernel: [<ffffffff800278de>] try_to_free_buffers+0x75/0xb8
kernel: [<ffffffff8803331d>] :jbd:journal_try_to_free_buffers+0x15a/0x1c2
kernel: [<ffffffff800ccd40>] __invalidate_mapping_pages+0x99/0x185
kernel: [<ffffffff800f8c68>] drop_pagecache+0xa5/0x13b
kernel: [<ffffffff80097518>] do_proc_dointvec_minmax_conv+0x0/0x56
kernel: [<ffffffff800f8d18>] drop_caches_sysctl_handler+0x1a/0x2c
kernel: [<ffffffff80097987>] do_rw_proc+0xcb/0x126
kernel: [<ffffffff80109503>] proc_reg_write+0x7e/0x99
kernel: [<ffffffff80016b92>] vfs_write+0xce/0x174
kernel: [<ffffffff8001745b>] sys_write+0x45/0x6e
kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Other stack traces were also observed:
1./2. _write_lock_irqsave / test_clear_page_dirty / :jbd:journal_try_to_free_buffers (see above)
3. kmem_cache_free / free_buffer_head
4. find_get_pages / pagevec_lookup
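To check how often the soft lockup fires and on which CPUs, the log can be summarized with a small shell helper. This is only a sketch: the function name `count_soft_lockups` is hypothetical, and it assumes the messages have the same format as the excerpt above and are stored in a syslog-style file such as /var/log/messages.

```shell
# Hypothetical helper (not part of any Red Hat tool): summarize
# "BUG: soft lockup" reports found in a syslog-style file.
# Assumes the message format shown in the excerpt above.
# Usage: count_soft_lockups /var/log/messages
count_soft_lockups() {
    # Keep only the soft lockup lines, reduce each to "CPU <n> stuck <t>s",
    # then count identical entries.
    grep 'BUG: soft lockup' "$1" \
        | sed -n 's/.*CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s.*/CPU \1 stuck \2s/p' \
        | sort | uniq -c
}
```

Each output line gives a count followed by the CPU number and the reported stall time, which makes it easier to see whether the lockups are confined to the CPU running the drop_caches write.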
Environment
- Red Hat Enterprise Linux (RHEL) 5
- a large amount of memory (e.g. 768 GB)