RHEL 8 perpetual soft lockup while walking THP-backed shared memory
Issue
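- The i915 graphics driver can forcefully request Transparent Huge Pages (THP) backed shared memory, leading to a perpetual soft lockup of the system.
- The issue has also been seen when using THP-backed shared memory without the i915 driver.
- The kernel log reports the soft lockup, followed by a panic, while a kworker truncates a shared memory file: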
[94859.997138] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/1:0:105968]
[94859.997144] Modules linked in: seqiv ip_vti ip_tunnel ah4 esp4 xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp tunnel6 chacha20poly1305 cmac camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 ccm xcbc des_generic uinput nft_counter nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables_set nf_tables libcrc32c nfnetlink sunrpc vfat fat intel_rapl_msr pmt_telemetry wmi_bmof pmt_class i2c_designware_platform i2c_designware_core snd_sof_pci_intel_tgl snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel soundwire_cadence snd_sof_intel_hda_mlink intel_rapl_common snd_sof_intel_hda snd_sof_pci x86_pkg_temp_thermal snd_hda_codec_hdmi intel_powerclamp snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_hda_ext_core coretemp snd_hda_codec_realtek snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_hda_codec_generic soundwire_generic_allocation ledtrig_audio soundwire_bus snd_soc_core kvm snd_compress irqbypass snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec
[94859.997193] snd_hda_core snd_hwdep snd_seq snd_seq_device intel_cstate snd_pcm joydev intel_uncore wdat_wdt pcspkr snd_timer snd soundcore i2c_i801 mei_me idma64 intel_lpss_pci intel_vsec intel_lpss mei wmi serial_multi_instantiate intel_pmc_core acpi_pad acpi_tad binfmt_misc ext4 mbcache jbd2 dm_crypt sd_mod t10_pi sg i915 i2c_algo_bit cec drm_buddy intel_gtt drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt ttm ahci crct10dif_pclmul crc32_pclmul libahci crc32c_intel drm libata ghash_clmulni_intel r8169 igc realtek video hid_multitouch dm_mirror dm_region_hash dm_log dm_mod
[94859.997234] CPU: 1 PID: 105968 Comm: kworker/1:0 Kdump: loaded Tainted: G U -------- - - 4.18.0-553.37.1.el8_10.x86_64 #1
[94859.997244] Hardware name: Draeger Infinity CentralStation Gen5 CPU/K3931-Nx, BIOS V5.0.0.26 R1.4.0 for K3931-Nxx 08/14/2024
[94859.997247] Workqueue: events delayed_fput
[94859.997251] RIP: 0010:xas_load+0x53/0x80
[94859.997255] Code: 41 38 48 10 77 ed 49 8b 50 08 48 d3 ea 83 e2 3f 89 d0 48 8d 44 c6 28 48 8b 00 49 89 70 18 48 89 c1 83 e1 03 48 83 f9 02 75 18 <48> 3d fd 00 00 00 77 10 48 c1 e8 02 89 c2 89 c0 48 8d 44 c6 28 48
[94859.997256] RSP: 0018:ffffa871815abb30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[94859.997258] RAX: ffff9be1de0c3232 RBX: 000000000000000f RCX: 0000000000000002
[94859.997259] RDX: 0000000000000000 RSI: ffff9be3efa61240 RDI: ffffa871815abb48
[94859.997260] RBP: 0000000000000001 R08: ffffa871815abb48 R09: ffffa871815abb48
[94859.997261] R10: ffffffffffffffff R11: 0000000000000001 R12: ffffffffffffffff
[94859.997262] R13: ffffa871815abc78 R14: ffffa871815abbf8 R15: 0000000000000000
[94859.997263] FS: 0000000000000000(0000) GS:ffff9be3f7c40000(0000) knlGS:0000000000000000
[94859.997264] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[94859.997265] CR2: 00007f09f61741a0 CR3: 0000000263010000 CR4: 0000000000750ee0
[94859.997266] PKRU: 55555554
[94859.997267] Call Trace:
[94859.997269] <IRQ>
[94859.997271] ? watchdog_timer_fn.cold.10+0x46/0x9e
[94859.997273] ? watchdog+0x30/0x30
[94859.997275] ? __hrtimer_run_queues+0x101/0x280
[94859.997278] ? hrtimer_interrupt+0x100/0x220
[94859.997279] ? sched_clock+0x5/0x10
[94859.997281] ? smp_apic_timer_interrupt+0x6a/0x130
[94859.997283] ? apic_timer_interrupt+0xf/0x20
[94859.997284] </IRQ>
[94859.997285] ? xas_load+0x53/0x80
[94859.997286] xas_find+0x183/0x1c0
[94859.997288] find_get_entries+0x219/0x2d0
[94859.997291] shmem_undo_range+0xec/0x8c0
[94859.997294] ? current_time+0x4a/0x90
[94859.997296] shmem_truncate_range+0x14/0x40
[94859.997298] shmem_evict_inode+0xe7/0x240
[94859.997299] ? var_wake_function+0x30/0x30
[94859.997302] evict+0xd2/0x1a0
[94859.997303] __dentry_kill+0xd5/0x170
[94859.997305] dentry_kill+0x4d/0x1a0
[94859.997306] dput.part.33+0xff/0x150
[94859.997308] __fput+0x10b/0x250
[94859.997310] delayed_fput+0x1c/0x30
[94859.997311] process_one_work+0x1d3/0x390
[94859.997314] ? process_one_work+0x390/0x390
[94859.997315] worker_thread+0x30/0x390
[94859.997317] ? process_one_work+0x390/0x390
[94859.997319] kthread+0x134/0x150
[94859.997321] ? set_kthread_struct+0x50/0x50
[94859.997322] ret_from_fork+0x1f/0x40
[94859.997326] Kernel panic - not syncing: softlockup: hung tasks
[94859.997327] CPU: 1 PID: 105968 Comm: kworker/1:0 Kdump: loaded Tainted: G U L -------- - - 4.18.0-553.37.1.el8_10.x86_64 #1
[94859.997329] Hardware name: Draeger Infinity CentralStation Gen5 CPU/K3931-Nx, BIOS V5.0.0.26 R1.4.0 for K3931-Nxx 08/14/2024
[94859.997330] Workqueue: events delayed_fput
[94859.997332] Call Trace:
[94859.997333] <IRQ>
[94859.997334] dump_stack+0x41/0x60
[94859.997336] panic+0xe7/0x2ac
[94859.997338] ? syscall_return_via_sysret+0x6e/0x94
[94859.997340] watchdog_timer_fn.cold.10+0x85/0x9e
[94859.997341] ? watchdog+0x30/0x30
[94859.997343] __hrtimer_run_queues+0x101/0x280
[94859.997345] hrtimer_interrupt+0x100/0x220
[94859.997347] ? sched_clock+0x5/0x10
[94859.997349] smp_apic_timer_interrupt+0x6a/0x130
[94859.997350] apic_timer_interrupt+0xf/0x20
[94859.997352] </IRQ>
[94859.997352] RIP: 0010:xas_load+0x53/0x80
[94859.997354] Code: 41 38 48 10 77 ed 49 8b 50 08 48 d3 ea 83 e2 3f 89 d0 48 8d 44 c6 28 48 8b 00 49 89 70 18 48 89 c1 83 e1 03 48 83 f9 02 75 18 <48> 3d fd 00 00 00 77 10 48 c1 e8 02 89 c2 89 c0 48 8d 44 c6 28 48
[94859.997355] RSP: 0018:ffffa871815abb30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[94859.997357] RAX: ffff9be1de0c3232 RBX: 000000000000000f RCX: 0000000000000002
[94859.997358] RDX: 0000000000000000 RSI: ffff9be3efa61240 RDI: ffffa871815abb48
[94859.997359] RBP: 0000000000000001 R08: ffffa871815abb48 R09: ffffa871815abb48
[94859.997360] R10: ffffffffffffffff R11: 0000000000000001 R12: ffffffffffffffff
[94859.997361] R13: ffffa871815abc78 R14: ffffa871815abbf8 R15: 0000000000000000
[94859.997362] xas_find+0x183/0x1c0
[94859.997364] find_get_entries+0x219/0x2d0
[94859.997366] shmem_undo_range+0xec/0x8c0
[94859.997368] ? current_time+0x4a/0x90
[94859.997370] shmem_truncate_range+0x14/0x40
[94859.997372] shmem_evict_inode+0xe7/0x240
[94859.997373] ? var_wake_function+0x30/0x30
[94859.997375] evict+0xd2/0x1a0
[94859.997376] __dentry_kill+0xd5/0x170
[94859.997378] dentry_kill+0x4d/0x1a0
[94859.997379] dput.part.33+0xff/0x150
[94859.997381] __fput+0x10b/0x250
[94859.997382] delayed_fput+0x1c/0x30
[94859.997384] process_one_work+0x1d3/0x390
[94859.997386] ? process_one_work+0x390/0x390
[94859.997388] worker_thread+0x30/0x390
[94859.997389] ? process_one_work+0x390/0x390
[94859.997391] kthread+0x134/0x150
[94859.997393] ? set_kthread_struct+0x50/0x50
[94859.997395] ret_from_fork+0x1f/0x40
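To compare another console log against the trace above, something like the following Python sketch may help. It is not an official diagnostic tool. The `SIGNATURE` tuple and the helper names are illustrative choices. The function names inside the tuple are copied from the call trace above. Frames prefixed with `?` are treated as unreliable and skipped, which is an assumption about how strict the match should be.

```python
import re

# Functions copied from the call trace above, outermost last.
# The choice of which frames to require is illustrative, not authoritative.
SIGNATURE = ("xas_find", "find_get_entries", "shmem_undo_range", "shmem_evict_inode")

# Matches "[timestamp] func+0xOFF/0xSIZE"; an optional "? " marks a speculative frame.
FRAME_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(log_text):
    """Return the function names of all frames that are not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


def matches_signature(log_text):
    """True if a soft lockup is reported and the SIGNATURE functions appear in order."""
    if "soft lockup" not in log_text:
        return False
    remaining = iter(reliable_frames(log_text))
    # 'func in remaining' consumes the iterator, so this is an ordered subsequence test.
    return all(func in remaining for func in SIGNATURE)
```

A match only indicates that the log resembles this trace. It does not confirm the root cause.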
Environment
- Red Hat Enterprise Linux 8.10
- Seen both with and without the i915 graphics driver
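As a rough way to see whether a system is currently using THP-backed shared memory, the sketch below reads two standard kernel interfaces: `/sys/kernel/mm/transparent_hugepage/shmem_enabled` and the `ShmemHugePages` field in `/proc/meminfo`. The helper names are hypothetical. The sysfs policy reflects only the system-wide setting, so a value of `never` should not be read as proof that no huge pages back shared memory, since a tmpfs mounted with a `huge=` option or a driver may still request them.

```python
import re
from pathlib import Path

# Standard locations on kernels built with THP support.
SHMEM_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/shmem_enabled")
MEMINFO = Path("/proc/meminfo")


def active_policy(text):
    """Return the bracketed (active) value, e.g. 'never' from 'always [never] deny'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def shmem_huge_kb(meminfo_text):
    """Return the ShmemHugePages value in kB, or None if the field is absent."""
    match = re.search(r"^ShmemHugePages:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def main():
    # Both files may be missing on kernels without THP or on non-Linux hosts.
    if SHMEM_ENABLED.exists():
        print("shmem_enabled policy:", active_policy(SHMEM_ENABLED.read_text()))
    else:
        print("shmem_enabled: not present")
    if MEMINFO.exists():
        print("ShmemHugePages (kB):", shmem_huge_kb(MEMINFO.read_text()))
    else:
        print("/proc/meminfo: not present")


if __name__ == "__main__":
    main()
```

A non-zero `ShmemHugePages` value suggests that huge pages are backing shared memory at the time of the check.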