kernel-rt hung with many blocked task messages logged in the kernel ring buffer. An NMI was initiated to crash kernel-rt and capture a vmcore file.

Solution Unverified - Updated -

Issue

  • kernel-rt hung, and many blocked task messages were logged in the kernel ring buffer.
[445262.369289] INFO: task umount.nfs:21141 blocked for more than 600 seconds.
[445262.369290] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[445262.369292] umount.nfs      D 0000000000000000     0 21141  21140 0x00000080
[445262.369294]  ffff881318d1fce8 0000000000000046 ffff881318d1c010 ffff881318d1c000
[445262.369295]  ffff881318d1c010 ffff881318d1c000 ffff881318d1ffd8 ffff881318d1c010
[445262.369296]  ffff881318d1c000 ffff881318d1c000 ffff887e172bcd10 ffff8813186099b0
[445262.369296] Call Trace:
[445262.369302]  [<ffffffff81620074>] schedule+0x34/0xa0
[445262.369304]  [<ffffffff8161e68d>] schedule_timeout+0x19d/0x260
[445262.369308]  [<ffffffff8110c246>] ? rcu_start_gp_advanced+0x46/0x50
[445262.369309]  [<ffffffff8110e132>] ? rcu_start_gp+0x42/0x50
[445262.369311]  [<ffffffff816207ef>] wait_for_completion+0xaf/0xf0
[445262.369313]  [<ffffffff812d2b2d>] ? kobject_release+0xd/0x10
[445262.369314]  [<ffffffff8110f5e0>] ? kfree_call_rcu+0x30/0x30
[445262.369316]  [<ffffffff81090a79>] wait_rcu_gp+0x49/0x60
[445262.369316]  [<ffffffff81090a90>] ? wait_rcu_gp+0x60/0x60
[445262.369318]  [<ffffffff811113a9>] synchronize_rcu+0x29/0x40
[445262.369327]  [<ffffffffa07bab46>] nfs_server_remove_lists+0xc6/0x100 [nfs]
[445262.369330]  [<ffffffffa07bad06>] nfs_free_server+0x16/0xa0 [nfs]
[445262.369335]  [<ffffffffa07c4de5>] nfs_kill_super+0x35/0x40 [nfs]
[445262.369338]  [<ffffffff811c1409>] deactivate_locked_super+0x59/0x80
[445262.369340]  [<ffffffff811c1c9a>] deactivate_super+0x4a/0x70
[445262.369342]  [<ffffffff811df6c2>] mntput_no_expire+0xd2/0x130
[445262.369344]  [<ffffffff811e26f4>] SyS_umount+0xc4/0x100
[445262.369347]  [<ffffffff8162a232>] system_call_fastpath+0x16/0x1b
  • An NMI was initiated to crash kernel-rt and capture a vmcore file.
[3049136.927846] Kernel panic - not syncing: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources:
                 1. Integrated Management Log (IML)
                 2. OA Syslog
                 3. OA Forward Progress Log
                 4. iLO Event Log
[3049136.927848] CPU: 0 PID: 21933 Comm: sssd_pam Not tainted 3.10.0-514.rt56.219.el6rt.x86_64 #1
[3049136.927849] Hardware name: HP ProLiant BL460c Gen8, BIOS I31 06/01/2015
[3049136.927851]  0000000000000000 ffff883f3f607d28 ffffffff8161e4b6 ffff883f3f607da8
[3049136.927852]  ffffffff8161e1fa 0000000000000010 ffff883f3f607db8 ffff883f3f607d58
[3049136.927853]  0000000000000000 0000000000000000 ffffffffa02212d8 0000000000000000
[3049136.927854] Call Trace:
[3049136.927860]  <NMI>  [<ffffffff8161e4b6>] dump_stack+0x19/0x1b
[3049136.927863]  [<ffffffff8161e1fa>] panic+0xca/0x1e4
[3049136.927876]  [<ffffffff8106867f>] nmi_panic+0x3f/0x40
[3049136.927879]  [<ffffffffa0220958>] hpwdt_pretimeout+0x98/0x110 [hpwdt]
[3049136.927881]  [<ffffffff81623a9e>] nmi_handle+0x7e/0x120
[3049136.927883]  [<ffffffff81623c08>] io_check_error+0x28/0xb0
[3049136.927884]  [<ffffffff81623e03>] default_do_nmi+0x173/0x210
[3049136.927885]  [<ffffffff81623f99>] do_nmi+0xf9/0x170
[3049136.927886]  [<ffffffff81622f56>] end_repeat_nmi+0x1e/0x2e
[3049136.927889]  [<ffffffff811c0e22>] ? prune_super+0x52/0x190
[3049136.927891]  [<ffffffff811c0e22>] ? prune_super+0x52/0x190
[3049136.927892]  [<ffffffff811c0e22>] ? prune_super+0x52/0x190
[3049136.927896]  <<EOE>>  [<ffffffff81168671>] shrink_slab+0xd1/0x3c0
[3049136.927897]  [<ffffffff8116a260>] ? shrink_zone+0xb0/0xd0
[3049136.927899]  [<ffffffff8116a31a>] ? shrink_zones+0x9a/0x140
[3049136.927900]  [<ffffffff8116a615>] do_try_to_free_pages+0x255/0x350
[3049136.927902]  [<ffffffff81166e7b>] ? throttle_direct_reclaim+0x8b/0x270
[3049136.927903]  [<ffffffff8116aa96>] try_to_free_pages+0xf6/0x1e0
[3049136.927905]  [<ffffffff8115d58f>] __alloc_pages_slowpath+0x29f/0x6b0
[3049136.927906]  [<ffffffff8115dccc>] __alloc_pages_nodemask+0x32c/0x340
[3049136.927908]  [<ffffffff810b3712>] ? enqueue_task_fair+0x172/0x460
[3049136.927911]  [<ffffffff811a03f8>] alloc_pages_vma+0xa8/0x170
[3049136.927913]  [<ffffffff81192112>] read_swap_cache_async+0x102/0x170
[3049136.927914]  [<ffffffff810b2559>] ? dequeue_entity+0xf9/0x380
[3049136.927915]  [<ffffffff81192226>] swapin_readahead+0xa6/0xf0
[3049136.927917]  [<ffffffff81180edc>] do_swap_page+0x11c/0x580
[3049136.927919]  [<ffffffff810991d8>] ? hrtimer_try_to_cancel+0x48/0x110
[3049136.927920]  [<ffffffff811814df>] handle_pte_fault+0x19f/0x230
[3049136.927922]  [<ffffffff810a52a3>] ? migrate_enable+0xd3/0x210
[3049136.927923]  [<ffffffff81181671>] __handle_mm_fault+0x101/0x1a0
[3049136.927924]  [<ffffffff811817c0>] handle_mm_fault+0xb0/0x160
[3049136.927926]  [<ffffffff81625fa7>] __do_page_fault+0x1f7/0x4d0
[3049136.927927]  [<ffffffff81626350>] do_page_fault+0x30/0x90
[3049136.927928]  [<ffffffff81622bf2>] page_fault+0x22/0x30
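
When a ring buffer contains many blocked task reports like the one above, it can help to summarize which tasks were blocked and where. The following is a minimal Python sketch, not part of the original solution, that assumes the log format shown in this article (a timestamp in brackets, the "INFO: task ... blocked for more than N seconds." header, and frames written as [<address>] symbol+0xoff/0xsize). The names parse_hung_tasks, HUNG_RE, FRAME_RE and SAMPLE are hypothetical. Frames marked with "?" are skipped because the kernel prints that marker for stack entries it considers unreliable.

```python
import re

# Header of a hung-task report, e.g.
# "[445262.369289] INFO: task umount.nfs:21141 blocked for more than 600 seconds."
HUNG_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

# One stack frame, e.g. " [<ffffffff81620074>] schedule+0x34/0xa0".
# A "?" before the symbol marks a frame the kernel flags as unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(lines):
    """Return one dict per hung-task report, with its reliable trace symbols."""
    reports = []
    current = None      # report currently being filled in, if any
    in_trace = False    # True once "Call Trace:" has been seen for it
    for line in lines:
        header = HUNG_RE.match(line)
        if header:
            current = {
                "timestamp": float(header.group("ts")),
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "blocked_secs": int(header.group("secs")),
                "trace": [],
            }
            reports.append(current)
            in_trace = False
            continue
        if current is None:
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame is None:
                # First non-frame line ends this report's trace.
                current = None
                in_trace = False
            elif not frame.group("unreliable"):
                current["trace"].append(frame.group("sym"))
    return reports


# Excerpt of the blocked task report shown in this article.
SAMPLE = """\
[445262.369289] INFO: task umount.nfs:21141 blocked for more than 600 seconds.
[445262.369290] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[445262.369292] umount.nfs      D 0000000000000000     0 21141  21140 0x00000080
[445262.369296] Call Trace:
[445262.369302]  [<ffffffff81620074>] schedule+0x34/0xa0
[445262.369304]  [<ffffffff8161e68d>] schedule_timeout+0x19d/0x260
[445262.369308]  [<ffffffff8110c246>] ? rcu_start_gp_advanced+0x46/0x50
[445262.369311]  [<ffffffff816207ef>] wait_for_completion+0xaf/0xf0
[445262.369316]  [<ffffffff81090a79>] wait_rcu_gp+0x49/0x60
[445262.369318]  [<ffffffff811113a9>] synchronize_rcu+0x29/0x40
[445262.369327]  [<ffffffffa07bab46>] nfs_server_remove_lists+0xc6/0x100 [nfs]
"""

if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE.splitlines()):
        print(report["comm"], report["pid"], " <- ".join(report["trace"]))
```

Applied to the report above, the sketch yields a single entry for umount.nfs (PID 21141) whose reliable frames run from schedule up through synchronize_rcu into nfs_server_remove_lists. That matches what the trace shows directly: the umount is waiting for an RCU grace period to complete.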

Environment

  • MRG Realtime 2 (kernel-3.10.0-514.rt56.219.el6rt.x86_64)
  • HP ProLiant BL460c Gen8
  • No tainted modules
