The message logs show soft lockups with a RIP in __kernel_poison_pages


Environment

  • Seen in, but not limited to, Red Hat Enterprise Linux 8

Issue

  • The message logs contain watchdog: BUG: soft lockup and the RIP is in __kernel_poison_pages.

Resolution

A Potential Workaround

  • Disabling Transparent Huge Pages (THP) may help mitigate the issue in cases where the soft lockup occurs because poison_page() is invoked 512 times to free a single order-9 allocation, which is typically a THP page. The order and page count are visible in the registers of the example in the Diagnostic Steps section:
RBX: 0000000000000009 <<----- order (9)
R13: ffffd3012e500000 <<----- page
R14: 0000000000000200 <<----- numpages (0x200 = 512)
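
As a sketch of the workaround, THP can be disabled at runtime via sysfs. Note this runtime change does not survive a reboot; to make it persistent, add transparent_hugepage=never to the kernel command line:

```shell
# Check the current THP mode; the active value is shown in brackets.
cat /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP for new allocations (runtime only; run as root).
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Existing huge pages are not split by this change; a reboot with transparent_hugepage=never guarantees no order-9 THP allocations are created at all.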

Root Cause

  • At the time of writing, the root cause was not fully understood. If the issue is seen, contact Red Hat support for more information.

  • A 'soft lockup' is defined as a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run.
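
The 20-second figure comes from the kernel watchdog: a soft lockup is reported when a CPU is stuck in kernel mode for longer than 2 * kernel.watchdog_thresh seconds (default 10). The current threshold can be read from procfs:

```shell
# Soft lockups are reported after 2 * watchdog_thresh seconds (default 10 -> 20s).
cat /proc/sys/kernel/watchdog_thresh
```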

  • In this case, the instruction pointer is always at __kernel_poison_pages+0x4f at the time the soft lockup occurs:
$ awk '/soft lockup/,/Call Trace/{if($6~/RIP:/){print $NF}}' var/log/messages | sort | uniq -c
     59 0010:__kernel_poison_pages+0x4f/0x90

 33 void __kernel_poison_pages(struct page *page, int n)
 34 {
 35         int i;
 36 
 37         for (i = 0; i < n; i++)
 38                 poison_page(page + i); <<----- poison_page() is being called inside the loop
 39 }

/usr/src/debug/kernel-4.18.0-553.50.1.el8_10/linux-4.18.0-553.50.1.el8_10.x86_64/mm/page_poison.c: 28
0xffffffffbbf2ba6b <__kernel_poison_pages+0x4b>:    lea    0x8(%rcx),%rdi
0xffffffffbbf2ba6f <__kernel_poison_pages+0x4f>:    mov    %rax,(%rcx) <<-----
0xffffffffbbf2ba72 <__kernel_poison_pages+0x52>:    and    $0xfffffffffffffff8,%rdi
0xffffffffbbf2ba76 <__kernel_poison_pages+0x56>:    mov    %rax,0xff8(%rcx)
0xffffffffbbf2ba7d <__kernel_poison_pages+0x5d>:    sub    %rdi,%rcx
0xffffffffbbf2ba80 <__kernel_poison_pages+0x60>:    add    $0x1000,%ecx
0xffffffffbbf2ba86 <__kernel_poison_pages+0x66>:    shr    $0x3,%ecx
0xffffffffbbf2ba89 <__kernel_poison_pages+0x69>:    rep stos %rax,%es:(%rdi)

 22 static void poison_page(struct page *page)
 23 {
 24         void *addr = kmap_atomic(page);
 25 
 26         /* KASAN still think the page is in-use, so skip it. */
 27         kasan_disable_current();
 28         memset(kasan_reset_tag(addr), PAGE_POISON, PAGE_SIZE); <<----- Here at the time the soft lockup occurs
 29         kasan_enable_current();
 30         kunmap_atomic(addr);
 31 }
  • The potential root cause is that the for loop in __kernel_poison_pages() executes 512 iterations, with each iteration invoking poison_page() to fill a 4 KiB subpage of a 2 MiB order-9 page with the poison pattern (PAGE_POISON 0xaa). This process can occasionally trigger a soft lockup, particularly when the KVM host is heavily loaded and vCPU threads suffer from scheduling delays.

Diagnostic Steps

  • The messages contain one or more soft lockup notices with a similar call stack:

                             vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
        Jan 15 15:04:38 host kernel: watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [Task1:5133]
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        Jan 15 15:04:38 host kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache dsa_filter(POE) dsa_filter_hook(OE) sunrpc intel_rapl_msr intel_rapl_common intel_pmc_core_pltdrv intel_pmc_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr i2c_piix4 virtio_balloon xfs libcrc32c sd_mod sr_mod t10_pi cdrom sg ata_generic qxl drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ata_piix virtio_net net_failover libata virtio_console serio_raw crc32c_intel drm virtio_scsi failover dm_mirror dm_region_hash dm_log dm_mod
        Jan 15 15:04:38 host kernel: Red Hat flags: eBPF/event
        Jan 15 15:04:38 host kernel: CPU: 8 PID: 5133 Comm: Task1 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-372.16.1.el8_6.x86_64 #1
        Jan 15 15:04:38 host kernel: Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014
                                     vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
        Jan 15 15:04:38 host kernel: RIP: 0010:__kernel_poison_pages+0x4f/0x80
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        Jan 15 15:04:38 host kernel: Code: 14 25 40 5c 01 00 83 82 f0 12 00 00 01 4c 89 c1 48 2b 0d e4 5e e9 00 48 c1 f9 06 48 c1 e1 0c 48 03 0d e5 5e e9 00 48 8d 79 08 <48> 89 01 48 83 e7 f8 48 89 81 f8 0f 00 00 48 29 f9 81 c1 00 10 00
        Jan 15 15:04:38 host kernel: RSP: 0018:ffff9b288a3d7a40 EFLAGS: 00010286 ORIG_RAX: ffffffffffffff13
        Jan 15 15:04:38 host kernel: RAX: aaaaaaaaaaaaaaaa RBX: 0000000000000009 RCX: ffff8cf15415b000
        Jan 15 15:04:38 host kernel: RDX: ffff8ce8a07ad280 RSI: ffffd3012e508000 RDI: ffff8cf15415b008
        Jan 15 15:04:38 host kernel: RBP: 0000000000000000 R08: ffffd3012e5056c0 R09: 00000000001021e9
        Jan 15 15:04:38 host kernel: R10: ffffffffffffffff R11: 0000000000c82c00 R12: ffffd30100000000
        Jan 15 15:04:38 host kernel: R13: ffffd3012e500000 R14: 0000000000000200 R15: 0000000000000000
        Jan 15 15:04:38 host kernel: FS:  0000000000000000(0000) GS:ffff8cf9af800000(0000) knlGS:0000000000000000
        Jan 15 15:04:38 host kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        Jan 15 15:04:38 host kernel: CR2: 00007f39da333700 CR3: 0000000d4a210006 CR4: 00000000003706e0
        Jan 15 15:04:38 host kernel: Call Trace:
        Jan 15 15:04:38 host kernel: __free_pages_ok+0x2a5/0x3e0
        Jan 15 15:04:38 host kernel: release_pages+0x160/0x450
        Jan 15 15:04:38 host kernel: tlb_flush_mmu+0x42/0x150
        Jan 15 15:04:38 host kernel: zap_huge_pmd+0x23a/0x340
        Jan 15 15:04:38 host kernel: unmap_page_range+0xbeb/0xdb0
        Jan 15 15:04:38 host kernel: ? cpumask_any_but+0x20/0x40
        Jan 15 15:04:38 host kernel: ? flush_tlb_mm_range+0xb0/0x110
        Jan 15 15:04:38 host kernel: ? free_ldt_pgtables+0x7d/0xa0
        Jan 15 15:04:38 host kernel: unmap_vmas+0xbc/0xe0
        Jan 15 15:04:38 host kernel: exit_mmap+0xaa/0x180
        Jan 15 15:04:38 host kernel: mmput+0x54/0x130
        Jan 15 15:04:38 host kernel: do_exit+0x2fb/0xb10
        Jan 15 15:04:38 host kernel: do_group_exit+0x3a/0xa0
        Jan 15 15:04:38 host kernel: get_signal+0x158/0x860
        Jan 15 15:04:38 host kernel: ? do_futex+0x2f5/0x4d0
        Jan 15 15:04:38 host kernel: do_signal+0x36/0x690
        Jan 15 15:04:38 host kernel: ? __audit_syscall_exit+0x249/0x2a0
        Jan 15 15:04:38 host kernel: exit_to_usermode_loop+0x89/0xf0
        Jan 15 15:04:38 host kernel: do_syscall_64+0x198/0x1a0
        Jan 15 15:04:38 host kernel: entry_SYSCALL_64_after_hwframe+0x65/0xc
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
