RHEL5: soft lockups in __invalidate_mapping_pages() on systems with high I/O load when dropping caches

Solution Verified - Updated -

Issue

After running a VFS I/O load test on a system with a large amount of memory (e.g. 768 GB) and then running

echo 1 > /proc/sys/vm/drop_caches

several "BUG: soft lockup" messages are seen in the system log:

kernel: BUG: soft lockup - CPU#11 stuck for 60s! [CCTEST-defaultt:2444]
egion_hash dm_log dm_mod dm_mem_cache usb_storage ahci libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
kernel: Pid: 2444, comm: CCTEST-defaultt Tainted: G     ---- 2.6.18-274.7.1.el5 #1
kernel: RIP: 0010:[<ffffffff80064b63>]  [<ffffffff80064b63>] _write_lock_irqsave+0x1/0x12
kernel: RSP: 0018:ffff81099f415cb8  EFLAGS: 00000286
kernel: RAX: 0000000000000065 RBX: ffff81607ee04ac0 RCX: ffff810ffa804f10
kernel: RDX: ffff8187331a9598 RSI: ffff81099f415d00 RDI: ffff81607ee04ad8
kernel: RBP: 000000000568f07e R08: ffff8187331a9550 R09: ffff8187331a9550
kernel: R10: ffff8187331a9550 R11: ffffffff88055ee8 R12: 0000000000000010
kernel: R13: 0000000000000002 R14: ffff81099f415c78 R15: ffff81607ee04ac8
kernel: FS:  00002aceaf6e3fe0(0000) GS:ffff810271d06440(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00000032c9e6baa0 CR3: 0000002aeead2000 CR4: 00000000000006a0
kernel: 
kernel: Call Trace:
kernel:  [<ffffffff80015551>] test_clear_page_dirty+0x7e/0x100
kernel:  [<ffffffff800278de>] try_to_free_buffers+0x75/0xb8
kernel:  [<ffffffff8803331d>] :jbd:journal_try_to_free_buffers+0x15a/0x1c2
kernel:  [<ffffffff800ccd40>] __invalidate_mapping_pages+0x99/0x185
kernel:  [<ffffffff800f8c68>] drop_pagecache+0xa5/0x13b
kernel:  [<ffffffff80097518>] do_proc_dointvec_minmax_conv+0x0/0x56
kernel:  [<ffffffff800f8d18>] drop_caches_sysctl_handler+0x1a/0x2c
kernel:  [<ffffffff80097987>] do_rw_proc+0xcb/0x126
kernel:  [<ffffffff80109503>] proc_reg_write+0x7e/0x99
kernel:  [<ffffffff80016b92>] vfs_write+0xce/0x174
kernel:  [<ffffffff8001745b>] sys_write+0x45/0x6e
kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Other stack traces were also observed during the soft lockups:

1./2. _write_lock_irqsave / test_clear_page_dirty / :jbd:journal_try_to_free_buffers (see above)
3. kmem_cache_free / free_buffer_head
4. find_get_pages / pagevec_lookup
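
When many such reports are present in a log, it can help to summarize which function each lockup was caught in. The following is a minimal sketch in Python, assuming the messages have the same layout as the excerpt above. The function name parse_soft_lockups and the dictionary keys are illustrative only and are not part of any Red Hat tool.

```python
import re

# Header line, e.g.:
#   kernel: BUG: soft lockup - CPU#11 stuck for 60s! [CCTEST-defaultt:2444]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)
# Instruction pointer line, e.g.:
#   kernel: RIP: 0010:[<...>]  [<...>] _write_lock_irqsave+0x1/0x12
RIP_RE = re.compile(r"RIP: .*\]\s+(?P<func>[\w.]+)\+0x")
# Call trace frame, with an optional ":module:" prefix, e.g.:
#   kernel:  [<...>] :jbd:journal_try_to_free_buffers+0x15a/0x1c2
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(?P<module>\w+):)?(?P<func>[\w.]+)\+0x"
)


def parse_soft_lockups(lines):
    """Return one dict per soft lockup report found in `lines`.

    Hypothetical helper: assumes the RHEL 5 message layout shown in
    the excerpt above and ignores everything else.
    """
    reports = []
    current = None
    in_trace = False
    for line in lines:
        header = LOCKUP_RE.search(line)
        if header:
            current = {
                "cpu": int(header.group("cpu")),
                "secs": int(header.group("secs")),
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "rip": None,
                "frames": [],
            }
            reports.append(current)
            in_trace = False
            continue
        if current is None:
            continue  # not inside a report yet
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame:
                current["frames"].append(frame.group("func"))
            else:
                in_trace = False  # first non-frame line ends the trace
            continue
        rip = RIP_RE.search(line)
        if rip:
            current["rip"] = rip.group("func")
    return reports
```

Passing the lines of the system log (for example, the result of reading /var/log/messages) to parse_soft_lockups() yields a list that can be grouped by the "rip" value or by the first entries in "frames" to see which of the variants listed above occurs most often.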

Environment

  • Red Hat Enterprise Linux (RHEL) 5
  • A large amount of memory (e.g. 768 GB)
