Deadlock happens between posix_cpu_timer_del() and run_posix_cpu_timers()

Solution Unverified

Issue

  • A deadlock occurs between posix_cpu_timer_del() and run_posix_cpu_timers().
  • The kernel logs rcu_sched CPU stall messages and a hard LOCKUP message, then panics:
[2280423.826700] INFO: rcu_sched detected stalls on CPUs/tasks: { 16 17 25 27 28 33 42} (detected by 21, t=60010 jiffies, g=550083904, c=550083903, q=846)
[2280423.842272] Task dump for CPU 16:
[2280423.842274] topgun-server.t R  running task        0 1794334 2066249 0x00000088
[2280423.842277] Call Trace:
[2280423.845254]  [<ffffffffafd8c060>] ? __schedule+0x320/0x680
[2280423.851700]  [<ffffffffaf6d7726>] __cond_resched+0x26/0x30
[2280423.858148]  [<ffffffffaf867d09>] ? dput+0x29/0x1a0
[2280423.863899]  [<ffffffffaf8cd1c4>] ? proc_flush_task+0x174/0x1b0
[2280423.870836]  [<ffffffffafd80d82>] ? queued_spin_lock_slowpath+0xb/0xf
[2280423.878374]  [<ffffffffaf717d0b>] ? queued_write_lock_slowpath+0x8b/0x90
[2280423.886206]  [<ffffffffafd8eedd>] ? _raw_qwrite_lock_irq+0x2d/0x40
[2280423.893444]  [<ffffffffaf697745>] ? tasklist_write_lock_irq+0x15/0x20
[2280423.900978]  [<ffffffffaf6a0074>] ? release_task+0x44/0x490
[2280423.907517]  [<ffffffffaf6a1c7e>] ? do_exit+0x5fe/0xa30
[2280423.913663]  [<ffffffffaf71358f>] ? futex_wait+0x11f/0x280
[2280423.920097]  [<ffffffffaf6a212f>] ? do_group_exit+0x3f/0xa0
[2280423.926639]  [<ffffffffaf6b328e>] ? get_signal_to_deliver+0x1ce/0x5e0
[2280423.934173]  [<ffffffffaf62c527>] ? do_signal+0x57/0x6f0
[2280423.940415]  [<ffffffffaf7152a6>] ? do_futex+0x106/0x4d0
[2280423.946662]  [<ffffffffaf62cc32>] ? do_notify_resume+0x72/0xc0
[2280423.953461]  [<ffffffffafd9a2ef>] ? int_signal+0x12/0x17
    ...
[2280433.674174] NMI watchdog: Watchdog detected hard LOCKUP on cpu 15
[2280433.681123] Modules linked in:
[2280433.692495]  scsi_transport_iscsi bonding sunrpc vfat fat dm_queue_length dm_multipath intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel i2c_algo_bit ttm kvm drm_kms_helper syscopyarea sysfillrect sysimgblt irqbypass fb_sys_fops crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ses drm enclosure raid1 glue_helper pcspkr ablk_helper ipmi_si cryptd hpilo joydev sg ipmi_devintf drm_panel_orientation_quirks hpwdt lpc_ich ipmi_msghandler mei_me mei tpm_crb wmi acpi_power_meter onload(OE) binfmt_misc sfc_char(OE) sfc_resource(OE) sfc_affinity(OE) ip_tables xfs libcrc32c dm_snapshot dm_bufio sd_mod lpfc sfc(OE) nvmet_fc nvmet mdio crc_t10dif ptp crct10dif_generic crc32c_intel pps_core crct10dif_pclmul nvme_fc mtd smartpqi nvme_fabrics nvme nvme_core scsi_transport_sas scsi_transport_fc scsi_tgt
[2280433.824683]  crct10dif_common dm_mirror dm_region_hash dm_log dm_mod
[2280433.840271] CPU: 15 PID: 1794105 Comm: topgun-server.t Kdump: loaded Tainted: G           OE  ------------ T 3.10.0-1160.59.1.el7.x86_64 #1
[2280433.875243] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/28/2021
[2280433.895785] task: ffff933680a4d280 ti: ffff93410d0d0000 task.ti: ffff93410d0d0000
[2280433.915139] RIP: 0010:[<ffffffffaf717aa2>]  [<ffffffffaf717aa2>] native_queued_spin_lock_slowpath+0x122/0x200
[2280433.937759] RSP: 0018:ffff93410d0d3c20  EFLAGS: 00000046
[2280433.955057] RAX: 0000000000000000 RBX: ffffffffb0207080 RCX: 0000000000790000
[2280433.974419] RDX: ffff936e3fadb8c0 RSI: 0000000000190001 RDI: ffffffffb0207084
[2280433.994089] RBP: ffff93410d0d3c20 R08: ffff93af3f45b8c0 R09: 0000000000000000
[2280434.013812] R10: 0000000000000001 R11: ffff93ae167e7f00 R12: ffffffffb0207084
[2280434.033302] R13: ffff933680a4d280 R14: ffff933680a4d280 R15: ffff93410d0d3858
[2280434.052801] FS:  00007f11f75fe700(0000) GS:ffff93af3f440000(0000) knlGS:0000000000000000
[2280434.073810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2280434.093236] CR2: 00007fb220093ff8 CR3: 0000006205c10000 CR4: 00000000007607e0
[2280434.113541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[2280434.133873] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[2280434.153909] PKRU: 55555554
[2280434.168814] Call Trace:
[2280434.183316]  [<ffffffffafd80d82>] queued_spin_lock_slowpath+0xb/0xf
[2280434.202247]  [<ffffffffaf717d0b>] queued_write_lock_slowpath+0x8b/0x90
[2280434.221511]  [<ffffffffafd8eedd>] _raw_qwrite_lock_irq+0x2d/0x40
[2280434.240558]  [<ffffffffaf697745>] tasklist_write_lock_irq+0x15/0x20
[2280434.259854]  [<ffffffffaf6a0074>] release_task+0x44/0x490
[2280434.277920]  [<ffffffffaf6a1c7e>] do_exit+0x5fe/0xa30
[2280434.295414]  [<ffffffffaf71358f>] ? futex_wait+0x11f/0x280
[2280434.313412]  [<ffffffffaf6594b2>] ? native_smp_send_reschedule+0x52/0x70
[2280434.333275]  [<ffffffffaf6a212f>] do_group_exit+0x3f/0xa0
[2280434.351298]  [<ffffffffaf6b328e>] get_signal_to_deliver+0x1ce/0x5e0
[2280434.370347]  [<ffffffffaf62c527>] do_signal+0x57/0x6f0
[2280434.388293]  [<ffffffffaf7152a6>] ? do_futex+0x106/0x4d0
[2280434.406302]  [<ffffffffaf6b21b5>] ? do_send_specific+0x75/0xa0
[2280434.424902]  [<ffffffffaf6b227f>] ? do_tkill+0x9f/0xd0
[2280434.443045]  [<ffffffffaf62cc32>] do_notify_resume+0x72/0xc0
[2280434.463626]  [<ffffffffafd9a2ef>] int_signal+0x12/0x17
[2280434.481800] Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 c0 b8 01 00 48 03 14 c5 60 18 35 b0 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b 
[2280434.528135] Kernel panic - not syncing: Hard LOCKUP
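When triaging a console log like the one above, it can help to extract the stalled CPUs and the CPU that hit the hard LOCKUP before opening the vmcore. The following is a minimal sketch, not a Red Hat tool: the function name `summarize` and the regular expressions are assumptions based only on the two message formats shown in this log, and other kernel versions may word these messages differently.

```python
import re

# Patterns written against the two message formats seen in the log above.
# They are assumptions for illustration, not an exhaustive kernel log grammar.
STALL_RE = re.compile(
    r"INFO: rcu_sched detected stalls on CPUs/tasks: \{([^}]*)\} "
    r"\(detected by (\d+), t=(\d+) jiffies"
)
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")


def summarize(log_text):
    """Collect stalled CPUs and hard-lockup CPUs from a console log (hypothetical helper)."""
    summary = {
        "stalled_cpus": [],
        "detected_by": None,
        "stall_jiffies": None,
        "lockup_cpus": [],
    }
    for line in log_text.splitlines():
        stall = STALL_RE.search(line)
        if stall:
            # The CPU list is space separated inside the braces.
            summary["stalled_cpus"] = [int(cpu) for cpu in stall.group(1).split()]
            summary["detected_by"] = int(stall.group(2))
            summary["stall_jiffies"] = int(stall.group(3))
        lockup = LOCKUP_RE.search(line)
        if lockup:
            summary["lockup_cpus"].append(int(lockup.group(1)))
    return summary


# Two lines copied from the log in this article.
SAMPLE = """\
[2280423.826700] INFO: rcu_sched detected stalls on CPUs/tasks: { 16 17 25 27 28 33 42} (detected by 21, t=60010 jiffies, g=550083904, c=550083903, q=846)
[2280433.674174] NMI watchdog: Watchdog detected hard LOCKUP on cpu 15
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The summary only narrows down which CPUs to inspect; confirming the deadlock itself still requires examining the backtraces of those CPUs in the vmcore.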

Environment

  • Red Hat Enterprise Linux 7.9.z - kernel-3.10.0-1160.59.1.el7
