A CPU runqueue spinlock deadlock causes a hard lockup and a crash. A third-party module bug is suspected.

Solution Unverified - Updated -

Issue

  • The kernel crashed due to a hard lockup.
[3244541.601489] NMI watchdog: Watchdog detected hard LOCKUP on cpu 51
[3244541.601529] Modules linked in:
[3244541.602787]  event_collector_2_0_324607(OE) nfsv3 nfs_acl nfs lockd grace fscache vfat fat amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd raid1 sg joydev pcspkr ccp k10temp i2c_piix4 ipmi_si ipmi_devintf ipmi_msghandler pinctrl_amd i2c_designware_platform i2c_designware_core acpi_cpufreq binfmt_misc auth_rpcgss sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic sr_mod cdrom ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe ttm ahci mdio ptp libahci drm crct10dif_pclmul pps_core crct10dif_common libata crc32c_intel serio_raw nvme dca nvme_core drm_panel_orientation_quirks nfit libnvdimm uas usb_storage dm_mirror dm_region_hash dm_log dm_mod
[3244541.610848] CPU: 51 PID: 0 Comm: swapper/51 Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-957.21.3.el7.x86_64 #1
[3244541.613115] Hardware name: HPE CL3150 G4/CL3150 G4 PCA, BIOS O51 04/03/2018
[3244541.614282] task: ffff8e2943aa30c0 ti: ffff8e2943ab8000 task.ti: ffff8e2943ab8000
[3244541.615449] RIP: 0010:[<ffffffffab9aef04>]  [<ffffffffab9aef04>] cpuidle_enter_state+0x54/0xd0
[3244541.616609] RSP: 0018:ffff8e2943abbe60  EFLAGS: 00000202
[3244541.617733] RAX: 000b86ceb37522d5 RBX: ffff8e2943abbe38 RCX: 0000000000000018
[3244541.618846] RDX: 0000000225c17d03 RSI: 0000000000000001 RDI: 000b86ceb37522d5
[3244541.619939] RBP: ffff8e2943abbe88 R08: 00000000000002fc R09: 0000000000000018
[3244541.621005] R10: 0000000000000284 R11: 0000000000000000 R12: 0000000000000000
[3244541.622030] R13: 0000002000000020 R14: ffffffffc08a3de0 R15: ffff8e2943abbe58
[3244541.623034] FS:  00007f00f7546740(0000) GS:ffff8e486f8c0000(0000) knlGS:00000000f76e1700
[3244541.624039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3244541.625026] CR2: 00007f00f6b5a590 CR3: 000000079c610000 CR4: 00000000003407e0
[3244541.626011] Call Trace:
[3244541.626988]  [<ffffffffab9af05e>] cpuidle_idle_call+0xde/0x230
[3244541.627980]  [<ffffffffab4366de>] arch_cpu_idle+0xe/0xc0
[3244541.628962]  [<ffffffffab4fc6da>] cpu_startup_entry+0x14a/0x1e0
[3244541.629946]  [<ffffffffab458047>] start_secondary+0x1f7/0x270
[3244541.630908]  [<ffffffffab4000d5>] start_cpu+0x5/0x14
[3244541.631849] Code: 89 fb e8 00 36 b5 ff 4c 89 e6 48 89 df 44 89 f2 49 89 c5 49 8b 47 48 e8 fb 80 dd ff 41 89 c4 e8 e3 35 b5 ff 48 89 c7 fb 66 66 90 <66> 66 90 4c 29 ef e8 f1 08 af ff 48 69 c0 40 42 0f 00 b9 ff ff 
[3244541.633878] Kernel panic - not syncing: Hard LOCKUP
[3244541.634871] CPU: 51 PID: 0 Comm: swapper/51 Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-957.21.3.el7.x86_64 #1
[3244541.636908] Hardware name: HPE CL3150 G4/CL3150 G4 PCA, BIOS O51 04/03/2018
[3244541.637959] Call Trace:
[3244541.639006]  <NMI>  [<ffffffffabb63107>] dump_stack+0x19/0x1b
[3244541.640081]  [<ffffffffabb5c810>] panic+0xe8/0x21f
[3244541.641141]  [<ffffffffab4974bf>] nmi_panic+0x3f/0x40
[3244541.642185]  [<ffffffffab5497a1>] watchdog_overflow_callback+0x121/0x140
[3244541.643231]  [<ffffffffab5a15e7>] __perf_event_overflow+0x57/0x100
[3244541.644289]  [<ffffffffab5aac54>] perf_event_overflow+0x14/0x20
[3244541.645345]  [<ffffffffab4055f0>] x86_pmu_handle_irq+0x140/0x1a0
[3244541.646406]  [<ffffffffabb6c031>] perf_event_nmi_handler+0x31/0x50
[3244541.647466]  [<ffffffffabb6d91c>] nmi_handle.isra.0+0x8c/0x150
[3244541.648527]  [<ffffffffabb6db3d>] do_nmi+0x15d/0x460
[3244541.649582]  [<ffffffffabb6cd89>] end_repeat_nmi+0x1e/0x81
[3244541.650633]  [<ffffffffab512486>] ? native_queued_spin_lock_slowpath+0x156/0x200
[3244541.651693]  [<ffffffffab512486>] ? native_queued_spin_lock_slowpath+0x156/0x200
[3244541.652738]  [<ffffffffab512486>] ? native_queued_spin_lock_slowpath+0x156/0x200
[3244541.653747]  <EOE>  <IRQ>  [<ffffffffabb5d28b>] queued_spin_lock_slowpath+0xb/0xf
[3244541.654749]  [<ffffffffabb6b760>] _raw_spin_lock+0x20/0x30
[3244541.655724]  [<ffffffffab4d4f43>] scheduler_tick+0x43/0x150
[3244541.656671]  [<ffffffffab50b080>] ? tick_sched_do_timer+0x50/0x50
[3244541.657578]  [<ffffffffab4ab2c5>] update_process_times+0x65/0x80
[3244541.658460]  [<ffffffffab50adf0>] tick_sched_handle+0x30/0x70
[3244541.659317]  [<ffffffffab50b0b9>] tick_sched_timer+0x39/0x80
[3244541.660151]  [<ffffffffab4c6103>] __hrtimer_run_queues+0xf3/0x270
[3244541.660964]  [<ffffffffab4c668f>] hrtimer_interrupt+0xaf/0x1d0
[3244541.661763]  [<ffffffffab45a58b>] local_apic_timer_interrupt+0x3b/0x60
[3244541.662559]  [<ffffffffabb7a6e3>] smp_apic_timer_interrupt+0x43/0x60
[3244541.663343]  [<ffffffffabb76df2>] apic_timer_interrupt+0x162/0x170
[3244541.664115]  <EOI>  [<ffffffffab9aef04>] ? cpuidle_enter_state+0x54/0xd0
[3244541.664892]  [<ffffffffab9af05e>] cpuidle_idle_call+0xde/0x230
[3244541.665665]  [<ffffffffab4366de>] arch_cpu_idle+0xe/0xc0
[3244541.666434]  [<ffffffffab4fc6da>] cpu_startup_entry+0x14a/0x1e0
[3244541.667206]  [<ffffffffab458047>] start_secondary+0x1f7/0x270
[3244541.667974]  [<ffffffffab4000d5>] start_cpu+0x5/0x14

Environment

  • Red Hat Enterprise Linux 7.6 (kernel-3.10.0-957.21.3.el7)
  • An out-of-tree (third-party) module named "event_collector_2_0_324607" is installed and loaded.
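
In the "Modules linked in:" output, the kernel appends taint flags in parentheses to any module that taints it. In the log above only event_collector_2_0_324607 carries such flags, (OE). The kernel's taint documentation describes O as an externally built (out-of-tree) module and E as an unsigned module. The following is a minimal Python sketch, not a Red Hat tool, for pulling flagged modules out of a saved console log. The function names and the regular expression are illustrative assumptions.

```python
import re

# Assumed pattern: a module name immediately followed by taint flags in
# parentheses, e.g. "event_collector_2_0_324607(OE)". A trailing "+" or "-"
# (module being loaded or unloaded) is tolerated as well.
MODULE_WITH_FLAGS = re.compile(r"\b([A-Za-z0-9_]+)\(([A-Z]+[+-]?)\)")


def flagged_modules(log_text):
    """Return {module_name: flags} for modules printed with taint flags."""
    return {name: flags for name, flags in MODULE_WITH_FLAGS.findall(log_text)}


def out_of_tree_modules(log_text):
    """Return the sorted names of modules whose flags include 'O'."""
    return sorted(
        name for name, flags in flagged_modules(log_text).items() if "O" in flags
    )


# Excerpt of the "Modules linked in:" line from the log in this article.
sample = (
    "[3244541.602787]  event_collector_2_0_324607(OE) nfsv3 nfs_acl nfs lockd "
    "grace fscache vfat fat"
)

print(out_of_tree_modules(sample))
```

Running the sketch against the full console log should list the same module named in the Environment section, which is a reasonable first candidate to review when a vendor bug is suspected.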
