Frequent spurious hard lockups on a Dell PowerEdge R740xd with memory modules from a mixture of vendors.
Issue
- Spurious hard lockups occur frequently on a Dell PowerEdge R740xd populated with memory modules from a mixture of vendors.
[343614.012474] Kernel panic - not syncing: Hard LOCKUP
[343614.012474] CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G I --------- - - 4.18.0-305.3.1.el8_4.x86_64 #1
[343614.012475] Hardware name: Dell Inc. PowerEdge R740xd/014X06, BIOS 2.10.0 11/12/2020
[343614.012475] Call Trace:
[343614.012475] <NMI>
[343614.012475] dump_stack+0x5c/0x80
[343614.012475] panic+0xe7/0x2a9
[343614.012476] ? secondary_startup_64_no_verify+0xbc/0xcb
[343614.012476] nmi_panic.cold.9+0xc/0xc
[343614.012476] watchdog_overflow_callback.cold.7+0x5c/0x70
[343614.012476] __perf_event_overflow+0x52/0xf0
[343614.012476] handle_pmi_common+0x204/0x2a0
[343614.012477] ? __set_pte_vaddr+0x32/0x50
[343614.012477] ? __native_set_fixmap+0x24/0x30
[343614.012477] ? ghes_copy_tofrom_phys+0xd3/0x1c0
[343614.012477] intel_pmu_handle_irq+0xbf/0x160
[343614.012477] perf_event_nmi_handler+0x2d/0x50
[343614.012478] nmi_handle+0x63/0x110
[343614.012478] default_do_nmi+0x49/0x100
[343614.012478] do_nmi+0x17e/0x1e0
[343614.012478] end_repeat_nmi+0x16/0x6f
[343614.012478] RIP: 0010:sched_clock_cpu+0x1/0xb0
[343614.012479] Code: 89 74 11 10 48 89 ee 89 c7 e8 1b 04 81 00 3b 05 99 ae 70 01 0f 83 1b 02 00 00 eb c1 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 55 <53> 0f 1f 44 00 00 e8 04 5a f1 ff 48 03 05 dd ac 70 01 48 89 c2 48
[343614.012479] RSP: 0018:ffffbd56cccf4fa8 EFLAGS: 00000002
[343614.012479] RAX: 0000000000000001 RBX: ffff9ffa3f4d6630 RCX: 00000000000006e0
[343614.012480] RDX: 000000000008cfa4 RSI: 00000000e9ae3eb4 RDI: 0000000000000007
[343614.012480] RBP: ffff9fcb01ed8000 R08: 0000000000000002 R09: 0000000000029700
[343614.012480] R10: 0008cfa4e9ac8be8 R11: 0000000000000000 R12: 0000000000000000
[343614.012480] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[343614.012481] ? sched_clock_cpu+0x1/0xb0
[343614.012481] ? sched_clock_cpu+0x1/0xb0
[343614.012481] </NMI>
[343614.012481] <IRQ>
[343614.012481] irqtime_account_irq+0x32/0xa0
[343614.012482] irq_exit+0x1b/0x100
[343614.012482] smp_apic_timer_interrupt+0x74/0x130
[343614.012482] apic_timer_interrupt+0xf/0x20
[343614.012482] </IRQ>
[343614.012482] RIP: 0010:cpuidle_enter_state+0xd9/0x3c0
[343614.012483] Code: e8 2c 7d 9f ff 80 7c 24 07 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 97 02 00 00 31 ff e8 de e2 a5 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 23 01 00 00 49 63 d5 48 2b 5c 24 08 48 8d 04 d5 00
[343614.012483] RSP: 0018:ffffbd56cc937e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[343614.012483] RAX: ffff9ffa3f4e9ec0 RBX: 00013883df251b3f RCX: 000000000000001f
[343614.012484] RDX: 00013883df251b3f RSI: 000000003d1879ab RDI: 0000000000000000
[343614.012484] RBP: ffffdd56bf4c0258 R08: 0000000000000002 R09: 0000000000029700
[343614.012484] R10: 0008cfa4e9ac8a54 R11: ffff9ffa3f4e8be4 R12: ffffffffa4d30a40
[343614.012484] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[343614.012485] ? cpuidle_enter_state+0xb4/0x3c0
[343614.012485] cpuidle_enter+0x2c/0x40
[343614.012485] do_idle+0x234/0x260
[343614.012485] cpu_startup_entry+0x6f/0x80
[343614.012485] start_secondary+0x199/0x1e0
[343614.012486] secondary_startup_64_no_verify+0xc2/0xcb
Environment
- Red Hat Enterprise Linux 8.4
  - kernel-4.18.0-305.3.1.el8_4.x86_64
- Dell Inc. PowerEdge R740xd
- The installed memory modules may come from a mixture of different vendors
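
To check whether a system matches the last condition above, the DIMM vendors can be read from the SMBIOS tables. The following is a minimal sketch, not part of the original article: it assumes the text output of `dmidecode --type 17` is available, and the function name `dimm_vendors` is hypothetical. It groups populated slots by the reported `Manufacturer` field.

```python
import re
from collections import defaultdict


def dimm_vendors(dmidecode_text):
    """Group populated DIMM slots by Manufacturer.

    Assumes `dmidecode_text` is the output of `dmidecode --type 17`
    (hypothetical helper for illustration only).
    """
    vendors = defaultdict(list)
    # DMI records are separated by blank lines.
    for block in re.split(r"\n\s*\n", dmidecode_text):
        if "Memory Device" not in block:
            continue
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.strip().partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Empty slots report "No Module Installed" as their size.
        if fields.get("Size", "").startswith("No Module"):
            continue
        vendor = fields.get("Manufacturer", "Unknown")
        vendors[vendor].append(fields.get("Locator", "?"))
    return dict(vendors)
```

If the returned dictionary has more than one key, the system has memory modules from more than one vendor. The function could be fed, for example, from a saved copy of the `dmidecode --type 17` output collected as root.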