Oracle RAC eviction following multiple soft lockups in shrink_zone on RHEL 5.3 or earlier
Issue
- Oracle RAC nodes are being evicted and rebooted after the kernel reports multiple soft lockups in /var/log/messages.
- The server crashes after the following error messages:
    Jul 20 18:40:31 node1 kernel: BUG: soft lockup - CPU#14 stuck for 10s! [bgsagent:23164]
    Jul 20 18:41:02 node1 kernel: BUG: soft lockup - CPU#6 stuck for 10s! [oraagent.bin:23494]
    Jul 20 18:41:26 node1 kernel: BUG: soft lockup - CPU#7 stuck for 10s! [perl:17624]
    Jul 20 18:41:28 node1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! [oracle:7233]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#11 stuck for 10s! [oracle:9884]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#15 stuck for 11s! [oracle:20656]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#7 stuck for 10s! [perl:17624]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#11 stuck for 10s! [orarootagent.bi:24100]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [multipathd:10205]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#7 stuck for 14s! [perl:17624]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#3 stuck for 19s! [oracle:7233]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#11 stuck for 12s! [orarootagent.bi:24100]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#15 stuck for 13s! [oracle:20656]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#3 stuck for 11s! [oracle:7233]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#12 stuck for 10s! [multipathd:10205]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#7 stuck for 11s! [perl:17624]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#11 stuck for 15s! [orarootagent.bi:24100]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#15 stuck for 13s! [oracle:20656]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#3 stuck for 13s! [oracle:7233]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#15 stuck for 12s! [oracle:913]
    Jul 20 18:41:29 node1 kernel: BUG: soft lockup - CPU#11 stuck for 12s! [orarootagent.bi:24100]
- Multiple soft lockups are occurring in shrink_zone or shrink_inactive_list:
    Aug 6 14:52:33 node1 kernel: BUG: soft lockup - CPU#11 stuck for 10s! [tnslsnr:29345]
    Aug 6 14:52:33 node1 kernel: CPU 11:
    Aug 6 14:52:33 node1 kernel: Modules linked in: oracleasm(U) ocfs2(U) ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U) configfs bonding dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg pcspkr bnx2 ide_cd serio_raw cdrom hpilo dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod lpfc scsi_transport_fc shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
    Aug 6 14:52:33 node1 kernel: Pid: 29345, comm: tnslsnr Tainted: G 2.6.18-128.7.1.el5 #1
    Aug 6 14:52:33 node1 kernel: RIP: 0010:[<ffffffff800c7a5f>] [<ffffffff800c7a5f>] shrink_inactive_list+0x770/0x7f9
    Aug 6 14:52:33 node1 kernel: RSP: 0018:ffff811a01aa9828 EFLAGS: 00000246
    Aug 6 14:52:33 node1 kernel: RAX: 000000000000000e RBX: ffff81016f63bc78 RCX: ffff810009064460
    Aug 6 14:52:33 node1 kernel: RDX: ffff81016f63bc40 RSI: ffff81000008bf98 RDI: ffff811a01aa98e8
    Aug 6 14:52:33 node1 kernel: RBP: 0000000000000000 R08: ffff81000008b600 R09: ffff811a01aa9a78
    Aug 6 14:52:33 node1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8120644d9860
    Aug 6 14:52:33 node1 kernel: R13: ffff81131e04ceb0 R14: 0000000b8804dd3d R15: ffff811967239990
    Aug 6 14:52:33 node1 kernel: FS: 00002ba65c4d7450(0000) GS:ffff810171db1f40(0000) knlGS:00000000f696bb90
    Aug 6 14:52:33 node1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    Aug 6 14:52:38 node1 kernel: CR2: 0000000015cd78e4 CR3: 00000015412a3000 CR4: 00000000000006e0
    Aug 6 14:52:38 node1 kernel:
    Aug 6 14:52:38 node1 kernel: Call Trace:
    Aug 6 14:52:38 node1 kernel: [<ffffffff800c7a1d>] shrink_inactive_list+0x72e/0x7f9
    Aug 6 14:52:38 node1 kernel: [<ffffffff80046a6e>] try_to_wake_up+0x46f/0x481
    Aug 6 14:52:38 node1 kernel: [<ffffffff80012d17>] shrink_zone+0xf6/0x11c
    Aug 6 14:52:38 node1 kernel: [<ffffffff800c81e5>] try_to_free_pages+0x197/0x2c2
    Aug 6 14:52:38 node1 kernel: [<ffffffff8000f270>] __alloc_pages+0x1cb/0x2ce
    Aug 6 14:52:38 node1 kernel: [<ffffffff8003c04f>] __get_free_pages+0xe/0x71
    Aug 6 14:52:38 node1 kernel: [<ffffffff8001e6d8>] __pollwait+0x58/0xe2
    Aug 6 14:52:38 node1 kernel: [<ffffffff8002d8b1>] pipe_poll+0x2d/0x90
    Aug 6 14:52:38 node1 kernel: [<ffffffff8002f3c6>] do_sys_poll+0x1b8/0x360
    Aug 6 14:52:38 node1 kernel: [<ffffffff8001e680>] __pollwait+0x0/0xe2
    Aug 6 14:52:38 node1 kernel: [<ffffffff8008a4b4>] default_wake_function+0x0/0xe
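To check whether another system shows the same pattern, the log excerpts above can be summarized with a short script. The following is a minimal sketch, not part of the original solution: it assumes one syslog entry per line in the message format shown above, and the names summarize_lockups, LOCKUP_RE, SYMBOL_RE and RECLAIM_SYMBOLS are hypothetical helpers introduced here for illustration. It counts soft lockup reports per process name, records the longest reported stall, and counts how often shrink_zone or shrink_inactive_list appears in RIP or Call Trace lines.

```python
import re
from collections import Counter

# Soft lockup report as it appears in the excerpts above, e.g.
# "BUG: soft lockup - CPU#3 stuck for 19s! [oracle:7233]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)

# A symbol with offset/size in a RIP or Call Trace line, e.g.
# "shrink_zone+0xf6/0x11c"
SYMBOL_RE = re.compile(
    r"(?P<sym>[A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Page reclaim functions named in this issue description.
RECLAIM_SYMBOLS = {"shrink_zone", "shrink_inactive_list"}


def summarize_lockups(lines):
    """Return (lockups per process name, longest stall in seconds,
    occurrences of the reclaim symbols) for an iterable of log lines."""
    per_comm = Counter()
    reclaim_hits = Counter()
    max_secs = 0
    for line in lines:
        lockup = LOCKUP_RE.search(line)
        if lockup:
            per_comm[lockup.group("comm")] += 1
            max_secs = max(max_secs, int(lockup.group("secs")))
        for match in SYMBOL_RE.finditer(line):
            if match.group("sym") in RECLAIM_SYMBOLS:
                reclaim_hits[match.group("sym")] += 1
    return per_comm, max_secs, reclaim_hits
```

A possible usage, assuming read access to the log file: open /var/log/messages, pass the file object to summarize_lockups, and inspect the returned counters. Many lockups spread across unrelated processes, combined with non-zero counts for the reclaim symbols, would be consistent with the symptoms described here, though the script only matches text and does not diagnose the cause.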
Environment
Red Hat Enterprise Linux (RHEL) 5
