RHEL 7 crash in complete_cmd_fusion() when processing interrupt for megaraid_sas
Issue
- The system crashes, usually during shutdown, with the following kernel messages:
[ 670.733681] megaraid_sas 0000:04:00.0: Command timedout from megasas_flush_cache
[ 673.667798] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 673.667843] IP: [<ffffffffa0056eb8>] complete_cmd_fusion+0x1f8/0x480 [megaraid_sas]
[ 673.667882] PGD 0
[ 673.667904] Oops: 0000 [#1] SMP
[...]
[ 673.668724] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G OE ------------ 3.10.0-327.41.3.el7.x86_64 #1
[ 673.668754] Hardware name: Huawei 5288 V3/BC11HGSE0, BIOS 3.63 05/19/2017
[ 673.668778] task: ffff881851190b80 ti: ffff881029750000 task.ti: ffff881029750000
[ 673.668805] RIP: 0010:[<ffffffffa0056eb8>] [<ffffffffa0056eb8>] complete_cmd_fusion+0x1f8/0x480 [megaraid_sas]
[ 673.669013] RSP: 0018:ffff88103ff83e18 EFLAGS: 00010046
[ 673.669203] RAX: 0000000000000000 RBX: ffff8800350002c0 RCX: 0000000000000000
[ 673.669387] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 673.669578] RBP: ffff88103ff83e80 R08: ffff881023900000 R09: 0000000000000000
[ 673.669772] R10: ffff881023808740 R11: 0000000000000000 R12: ffff881023893400
[ 673.669967] R13: 0000000000000000 R14: ffff8800350b9a00 R15: ffff881023900000
[ 673.670167] FS: 0000000000000000(0000) GS:ffff88103ff80000(0000) knlGS:0000000000000000
[ 673.670544] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 673.670736] CR2: 0000000000000000 CR3: 0000001027536000 CR4: 00000000003407e0
[ 673.670932] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 673.671130] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 673.671328] Stack:
[ 673.671508] ffff88103ff83e38 00000000810b5ffd 0000000000000000 0000000000000000
[ 673.671902] ffff88103ff83e80 ffffffff810b8796 ffff881023808740 0000881000000000
[ 673.672291] ffff881023808740 0000000000000060 ffff88184dbb9d00 0000000000000000
[ 673.672676] Call Trace:
[ 673.672858] <IRQ>
[ 673.672869]
[ 673.673063] [<ffffffff810b8796>] ? try_to_wake_up+0x1b6/0x300
[ 673.673252] [<ffffffffa00571fc>] megasas_isr_fusion+0x3c/0x190 [megaraid_sas]
[ 673.673576] [<ffffffff8111c77e>] handle_irq_event_percpu+0x3e/0x1e0
[ 673.673742] [<ffffffff8111c95d>] handle_irq_event+0x3d/0x60
[ 673.673908] [<ffffffff8111f5f7>] handle_edge_irq+0x77/0x130
[ 673.674075] [<ffffffff81016ecf>] handle_irq+0xbf/0x150
[ 673.674240] [<ffffffff810e15ba>] ? tick_check_idle+0x8a/0xd0
[ 673.674406] [<ffffffff816426ba>] ? atomic_notifier_call_chain+0x1a/0x20
[ 673.674574] [<ffffffff8164912f>] do_IRQ+0x4f/0xf0
[ 673.674739] [<ffffffff8163e46d>] common_interrupt+0x6d/0x6d
[ 673.674901] <EOI>
- The kernel panic stack trace:
crash> bt
PID: 0 TASK: ffff881851190b80 CPU: 6 COMMAND: "swapper/6"
#0 [ffff88103ff83ad8] machine_kexec at ffffffff81051e9b
#1 [ffff88103ff83b38] crash_kexec at ffffffff810f27d2
#2 [ffff88103ff83c08] oops_end at ffffffff8163f588
#3 [ffff88103ff83c30] no_context at ffffffff8162f6b1
#4 [ffff88103ff83c80] __bad_area_nosemaphore at ffffffff8162f747
#5 [ffff88103ff83cc8] bad_area_nosemaphore at ffffffff8162f8b1
#6 [ffff88103ff83cd8] __do_page_fault at ffffffff8164230e
#7 [ffff88103ff83d38] do_page_fault at ffffffff816424a3
#8 [ffff88103ff83d60] page_fault at ffffffff8163e788
[exception RIP: complete_cmd_fusion+0x1f8]
RIP: ffffffffa0056eb8 RSP: ffff88103ff83e18 RFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8800350002c0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88103ff83e80 R8: ffff881023900000 R9: 0000000000000000
R10: ffff881023808740 R11: 0000000000000000 R12: ffff881023893400
R13: 0000000000000000 R14: ffff8800350b9a00 R15: ffff881023900000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff88103ff83e40] try_to_wake_up at ffffffff810b8796
#10 [ffff88103ff83e88] megasas_isr_fusion at ffffffffa00571fc [megaraid_sas]
#11 [ffff88103ff83eb0] handle_irq_event_percpu at ffffffff8111c77e
#12 [ffff88103ff83ef8] handle_irq_event at ffffffff8111c95d
#13 [ffff88103ff83f20] handle_edge_irq at ffffffff8111f5f7
#14 [ffff88103ff83f40] handle_irq at ffffffff81016ecf
#15 [ffff88103ff83f78] do_IRQ at ffffffff8164912f
--- <IRQ stack> ---
#16 [ffff881029753de8] ret_from_intr at ffffffff8163e46d
[exception RIP: native_safe_halt+0x6]
RIP: ffffffff81058e96 RSP: ffff881029753e98 RFLAGS: 00000286
RAX: 00000000ffffffed RBX: ffff88103ff8cf00 RCX: 0100000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
RBP: ffff881029753e98 R8: 0000000000000000 R9: 00000000000007da
R10: 0000000000000000 R11: 0000000000000000 R12: 000000a0505c65c0
R13: ffff88103ff8fbc0 R14: f789d8c96c1f8840 R15: 0000000000000086
ORIG_RAX: ffffffffffffff83 CS: 0010 SS: 0018
#17 [ffff881029753ea0] default_idle at ffffffff8101dbff
#18 [ffff881029753ec0] arch_cpu_idle at ffffffff8101e506
#19 [ffff881029753ed0] cpu_startup_entry at ffffffff810d64a5
#20 [ffff881029753f28] start_secondary at ffffffff8104768a
Environment
- Red Hat Enterprise Linux 7.2
- Kernel 3.10.0-327.41.3.el7.x86_64
- MegaRAID SAS driver 06.807.10.00-rh1
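
To compare a running system against the versions listed above, a minimal sketch such as the following may help. It assumes the loaded megaraid_sas module exposes its version at /sys/module/megaraid_sas/version; the function names are illustrative only. A match shows only that the system runs the same kernel and driver as listed here. It does not confirm that the system will hit this crash, and other versions may also be affected.

```python
import platform
from pathlib import Path

# Versions copied from the Environment list above.
LISTED_KERNEL = "3.10.0-327.41.3.el7.x86_64"
LISTED_DRIVER = "06.807.10.00-rh1"


def read_driver_version(path="/sys/module/megaraid_sas/version"):
    """Return the loaded megaraid_sas version string, or None if unavailable.

    Assumption: the module is loaded and publishes a version file in sysfs.
    """
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def matches_listed_environment(kernel, driver):
    """Return True only when both values equal the versions listed above."""
    return kernel == LISTED_KERNEL and driver == LISTED_DRIVER


if __name__ == "__main__":
    kernel = platform.release()
    driver = read_driver_version()
    print(f"kernel:       {kernel}")
    print(f"megaraid_sas: {driver or 'not loaded or version not exposed'}")
    print(f"matches listed environment: {matches_listed_environment(kernel, driver)}")
```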