Frequent spurious hard lockups on a Dell PowerEdge R740xd with memory modules from a mixture of vendors.

Solution Verified

Issue

  • Spurious hard lockups occur frequently on a Dell PowerEdge R740xd populated with memory modules from a mixture of vendors. The kernel panics with a trace like the following:
[343614.012474] Kernel panic - not syncing: Hard LOCKUP
[343614.012474] CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G          I      --------- -  - 4.18.0-305.3.1.el8_4.x86_64 #1
[343614.012475] Hardware name: Dell Inc. PowerEdge R740xd/014X06, BIOS 2.10.0 11/12/2020
[343614.012475] Call Trace:
[343614.012475]  <NMI>
[343614.012475]  dump_stack+0x5c/0x80
[343614.012475]  panic+0xe7/0x2a9
[343614.012476]  ? secondary_startup_64_no_verify+0xbc/0xcb
[343614.012476]  nmi_panic.cold.9+0xc/0xc
[343614.012476]  watchdog_overflow_callback.cold.7+0x5c/0x70
[343614.012476]  __perf_event_overflow+0x52/0xf0
[343614.012476]  handle_pmi_common+0x204/0x2a0
[343614.012477]  ? __set_pte_vaddr+0x32/0x50
[343614.012477]  ? __native_set_fixmap+0x24/0x30
[343614.012477]  ? ghes_copy_tofrom_phys+0xd3/0x1c0
[343614.012477]  intel_pmu_handle_irq+0xbf/0x160
[343614.012477]  perf_event_nmi_handler+0x2d/0x50
[343614.012478]  nmi_handle+0x63/0x110
[343614.012478]  default_do_nmi+0x49/0x100
[343614.012478]  do_nmi+0x17e/0x1e0
[343614.012478]  end_repeat_nmi+0x16/0x6f
[343614.012478] RIP: 0010:sched_clock_cpu+0x1/0xb0
[343614.012479] Code: 89 74 11 10 48 89 ee 89 c7 e8 1b 04 81 00 3b 05 99 ae 70 01 0f 83 1b 02 00 00 eb c1 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 55 <53> 0f 1f 44 00 00 e8 04 5a f1 ff 48 03 05 dd ac 70 01 48 89 c2 48
[343614.012479] RSP: 0018:ffffbd56cccf4fa8 EFLAGS: 00000002
[343614.012479] RAX: 0000000000000001 RBX: ffff9ffa3f4d6630 RCX: 00000000000006e0
[343614.012480] RDX: 000000000008cfa4 RSI: 00000000e9ae3eb4 RDI: 0000000000000007
[343614.012480] RBP: ffff9fcb01ed8000 R08: 0000000000000002 R09: 0000000000029700
[343614.012480] R10: 0008cfa4e9ac8be8 R11: 0000000000000000 R12: 0000000000000000
[343614.012480] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[343614.012481]  ? sched_clock_cpu+0x1/0xb0
[343614.012481]  ? sched_clock_cpu+0x1/0xb0
[343614.012481]  </NMI>
[343614.012481]  <IRQ>
[343614.012481]  irqtime_account_irq+0x32/0xa0
[343614.012482]  irq_exit+0x1b/0x100
[343614.012482]  smp_apic_timer_interrupt+0x74/0x130
[343614.012482]  apic_timer_interrupt+0xf/0x20
[343614.012482]  </IRQ>
[343614.012482] RIP: 0010:cpuidle_enter_state+0xd9/0x3c0
[343614.012483] Code: e8 2c 7d 9f ff 80 7c 24 07 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 97 02 00 00 31 ff e8 de e2 a5 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 23 01 00 00 49 63 d5 48 2b 5c 24 08 48 8d 04 d5 00
[343614.012483] RSP: 0018:ffffbd56cc937e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[343614.012483] RAX: ffff9ffa3f4e9ec0 RBX: 00013883df251b3f RCX: 000000000000001f
[343614.012484] RDX: 00013883df251b3f RSI: 000000003d1879ab RDI: 0000000000000000
[343614.012484] RBP: ffffdd56bf4c0258 R08: 0000000000000002 R09: 0000000000029700
[343614.012484] R10: 0008cfa4e9ac8a54 R11: ffff9ffa3f4e8be4 R12: ffffffffa4d30a40
[343614.012484] R13: 0000000000000002 R14: 0000000000000002 R15: 0000000000000002
[343614.012485]  ? cpuidle_enter_state+0xb4/0x3c0
[343614.012485]  cpuidle_enter+0x2c/0x40
[343614.012485]  do_idle+0x234/0x260
[343614.012485]  cpu_startup_entry+0x6f/0x80
[343614.012485]  start_secondary+0x199/0x1e0
[343614.012486]  secondary_startup_64_no_verify+0xc2/0xcb
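
When comparing several such panics, it can help to reduce each trace to a few fields. In the trace above, watchdog_overflow_callback appears in the <NMI> section and the interrupted task is swapper/7, the idle task for CPU 7. The following is a minimal sketch, not a Red Hat tool, that extracts the panic reason, the CPU, the task name, and the RIP symbols from saved console or vmcore-dmesg text. The function name summarize_panic and the regular expressions are assumptions based only on the line layout shown above; other kernel versions may format these lines differently.

```python
import re
import sys

# Kernel timestamp prefix, e.g. "[343614.012474] ".
_TS = r"^\[\s*\d+\.\d+\]\s+"

PANIC_RE = re.compile(_TS + r"Kernel panic - not syncing: (?P<reason>.+)$")
CPU_RE = re.compile(_TS + r"CPU: (?P<cpu>\d+) PID: \d+ Comm: (?P<comm>\S+)")
RIP_RE = re.compile(_TS + r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)\+")


def summarize_panic(log_text):
    """Return the first panic reason, CPU and task name, plus every RIP symbol."""
    summary = {"reason": None, "cpu": None, "comm": None, "rips": []}
    for line in log_text.splitlines():
        line = line.strip()
        match = PANIC_RE.match(line)
        if match and summary["reason"] is None:
            summary["reason"] = match.group("reason").strip()
            continue
        match = CPU_RE.match(line)
        if match and summary["cpu"] is None:
            summary["cpu"] = int(match.group("cpu"))
            summary["comm"] = match.group("comm")
            continue
        match = RIP_RE.match(line)
        if match:
            # The first RIP is the instruction interrupted by the NMI,
            # later ones belong to the outer (IRQ or task) context.
            summary["rips"].append(match.group("symbol"))
    return summary


if __name__ == "__main__":
    # Usage: python3 summarize_panic.py < vmcore-dmesg.txt
    print(summarize_panic(sys.stdin.read()))
```

A summary whose task name is a swapper thread and whose outer RIP is cpuidle_enter_state, as in this trace, indicates that the CPU was in the idle path when the watchdog fired.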

Environment

  • Red Hat Enterprise Linux 8.4 kernel-4.18.0-305.3.1.el8_4.x86_64
  • Dell Inc. PowerEdge R740xd
  • Memory modules that may come from a mixture of different vendors
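
To check whether a system matches the last item, the installed DIMMs can be inventoried with dmidecode -t memory, which prints one "Memory Device" record per slot. The sketch below groups populated slots by the reported Manufacturer string. It is only an inventory helper under stated assumptions: the function name dimm_vendors is hypothetical, the field names Size, Locator and Manufacturer are those printed by dmidecode for memory devices, and some firmware reports the manufacturer as a raw ID rather than a vendor name. It does not determine whether the modules are the cause of the lockups.

```python
import re
import subprocess
from collections import defaultdict


def dimm_vendors(dmidecode_text):
    """Map each Manufacturer string to the Locator names of populated slots."""
    vendors = defaultdict(list)
    # dmidecode separates records with blank lines.
    for block in re.split(r"\n\s*\n", dmidecode_text):
        lines = [line.strip() for line in block.splitlines()]
        if "Memory Device" not in lines:
            continue
        fields = {}
        for line in lines:
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Empty slots report "No Module Installed" as their size.
        if fields.get("Size", "No Module Installed") == "No Module Installed":
            continue
        manufacturer = fields.get("Manufacturer", "Unknown")
        vendors[manufacturer].append(fields.get("Locator", "?"))
    return dict(vendors)


if __name__ == "__main__":
    # dmidecode needs root privileges to read the DMI tables.
    output = subprocess.run(
        ["dmidecode", "-t", "memory"],
        capture_output=True, text=True, check=True,
    ).stdout
    for vendor, slots in sorted(dimm_vendors(output).items()):
        print(f"{vendor}: {', '.join(slots)}")
```

More than one line of output means the populated slots report more than one manufacturer string.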
