RHEL 8.4: kernel crashed due to list_del corruption with LIST_POISON2

Solution Verified

Environment

  • Red Hat Enterprise Linux 8.4
    • kernel-4.18.0-305.el8 / kernel-4.18.0-305.3.1.el8_4
    • VMware VM / s390x / Microsoft Corporation Virtual Machine / KVM Guest

Issue

  • The issue started after a kernel upgrade from kernel-4.18.0-240.22.1.el8 to kernel-4.18.0-305.el8.
  • The kernel crashed with the following logs:

    [570928.662632] list_del corruption, ffff8b3c3b76b048->prev is LIST_POISON2 (dead000000000200)
    [570928.662739] ------------[ cut here ]------------
    [570928.662740] kernel BUG at lib/list_debug.c:50!
    [570928.662773] invalid opcode: 0000 [#1] SMP PTI
    [570928.662790] CPU: 2 PID: 756280 Comm: kworker/2:0 Kdump: loaded Not tainted 4.18.0-305.el8.x86_64 #1
    [570928.662818] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
    [570928.662853] Workqueue: cgroup_destroy css_release_work_fn
    [570928.662874] RIP: 0010:__list_del_entry_valid.cold.1+0x45/0x4c
    [570928.662894] Code: e8 8a a5 cb ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 40 66 10 95 e8 76 a5 cb ff 0f 0b 48 89 fe 48 c7 c7 08 66 10 95 e8 65 a5 cb ff <0f> 0b 90 90 90 90 90 41 55 41 54 55 53 48 85 d2 74 5f 48 85 f6 74
    [570928.662950] RSP: 0018:ffffa22203613e68 EFLAGS: 00010246
    [570928.662969] RAX: 000000000000004e RBX: ffff8b3c3b76b090 RCX: 0000000000000000
    [570928.662992] RDX: 0000000000000000 RSI: ffff8b3f33d167c8 RDI: ffff8b3f33d167c8
    [570928.663014] RBP: ffffffff95826040 R08: 00000000000005b7 R09: 0000000000aaaaaa
    [570928.663037] R10: 0000000000000000 R11: ffffa22202dff200 R12: ffff8b3c3b76b000
    [570928.663059] R13: ffff8b3f2c0b0000 R14: ffff8b3dfe60d240 R15: ffff8b3c3b76b098
    [570928.663082] FS:  0000000000000000(0000) GS:ffff8b3f33d00000(0000) knlGS:0000000000000000
    [570928.663107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [570928.663126] CR2: 00007f2a5bddc500 CR3: 00000001e1a10005 CR4: 00000000003706e0
    [570928.663184] Call Trace:
    [570928.663204]  css_release_work_fn+0x3f/0x240
    [570928.663254]  process_one_work+0x1a7/0x360
    [570928.663276]  worker_thread+0x30/0x390
    [570928.663291]  ? create_worker+0x1a0/0x1a0
    [570928.663305]  kthread+0x116/0x130
    [570928.663326]  ? kthread_flush_work_fn+0x10/0x10
    [570928.663344]  ret_from_fork+0x35/0x40
    
  • Another pattern of logs:

    [1202369.537819] list_del corruption. next->prev should be ffff94c97f3f2098, but was ffff94c49aa94ad8
    [1202369.538099] ------------[ cut here ]------------
    [1202369.538221] kernel BUG at lib/list_debug.c:56!
    [1202369.538418] invalid opcode: 0000 [#1] SMP PTI
    [1202369.538539] CPU: 5 PID: 883812 Comm: kworker/5:1 Kdump: loaded Not tainted 4.18.0-305.3.1.el8_4.x86_64 #1
    [1202369.538689] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
    [1202369.538873] Workqueue: 0x0 (cgroup_destroy)
    [1202369.539025] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
    [1202369.539178] Code: 65 10 93 e8 dc a4 cb ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 18 66 10 93 e8 c8 a4 cb ff 0f 0b 48 c7 c7 c8 66 10 93 e8 ba a4 cb ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 88 66 10 93 e8 a6 a4 cb ff 0f 0b
    [1202369.539488] RSP: 0018:ffffb32bd004fe60 EFLAGS: 00010046
    [1202369.539622] RAX: 0000000000000054 RBX: ffff94c97f3f2098 RCX: 0000000000000000
    [1202369.539769] RDX: 0000000000000000 RSI: ffff94c9a5f567c8 RDI: ffff94c9a5f567c8
    [1202369.539912] RBP: ffff94c97f3f2090 R08: 00000000000006d8 R09: 0000000000aaaaaa
    [1202369.540052] R10: 0000000000000000 R11: ffffb32bc4f40200 R12: ffff94c4ca311090
    [1202369.540195] R13: ffff94c9a5f697e0 R14: ffffffff920fea30 R15: 0000000000000000
    [1202369.540347] FS: 0000000000000000(0000) GS:ffff94c9a5f40000(0000) knlGS:0000000000000000
    [1202369.540493] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [1202369.540635] CR2: 00000000000000b0 CR3: 0000000408810003 CR4: 00000000001706e0
    [1202369.540805] Call Trace:
    [1202369.540942] move_linked_works+0x49/0xa0
    [1202369.541071] ? create_worker+0x1a0/0x1a0
    [1202369.541190] pwq_activate_delayed_work+0x3e/0xb0
    [1202369.541327] pwq_dec_nr_in_flight+0x5d/0x90
    [1202369.541459] worker_thread+0x30/0x390
    [1202369.541573] ? create_worker+0x1a0/0x1a0
    [1202369.541724] kthread+0x116/0x130
    [1202369.541849] ? kthread_flush_work_fn+0x10/0x10
    [1202369.541995] ret_from_fork+0x35/0x40
    

Resolution

  • Update the kernel as follows:
    • Red Hat Enterprise Linux 8.4: update the kernel to kernel-4.18.0-305.12.1.el8_4 (erratum RHSA-2021:3057) or later.
    • Red Hat Enterprise Linux 8.5 and above: update the kernel to kernel-4.18.0-348.el8 (erratum RHSA-2021:4356) or later.
  • A possible workaround is to set "cgroup_disable=memory" in grub, but it is not guaranteed to avoid the issue completely.

Root Cause

  • The patch from the following upstream commit fixes the bug and resolves the issue.

    9f38f03ae8d5 mm: memcontrol: slab: fix obtain a reference to a freeing memcg
    
  • The bug can only be hit with RHEL 8.4 kernel versions from 4.18.0-305.el8 up to, but not including, 4.18.0-305.12.1.el8_4. It is caused by the patch from upstream commit 3de7d4f25a74, which was introduced in 4.18.0-305.el8.

    3de7d4f25a74 mm: memcg/slab: optimize objcg stock draining
    
  • In other words, the bug can only be hit with the following 8.4 / 8.4.z kernels:

    • kernel-4.18.0-305.el8
    • kernel-4.18.0-305.3.1.el8_4
    • kernel-4.18.0-305.7.1.el8_4
    • kernel-4.18.0-305.10.2.el8_4
  • The fix is contained in kernel-4.18.0-305.12.1.el8_4 and later.
  • The bug is therefore no longer hit on 8.4.z with kernel-4.18.0-305.12.1.el8_4 or newer, or on 8.5 or newer (including 8.6 and 8.7).
  • Please refer to this article if you need to know which kernel versions are associated with RHEL 8.x minor versions.
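
The affected range above can be checked mechanically. The following is a minimal sketch, not a Red Hat tool: the names krel, parse_release and is_affected are hypothetical, and it assumes the input looks like the output of uname -r on a RHEL 8 system ("4.18.0-<build>[.<z1>.<z2>].el8..."). It treats only build 305 with a z-stream below 12.1 as affected, which is how this article describes the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type; not part of any Red Hat tool. */
struct krel {
    int build;   /* e.g. 305 in 4.18.0-305.10.2.el8_4 */
    int z[2];    /* z-stream fields, e.g. {10, 2}; {0, 0} when absent */
};

/* Parse a release string such as "4.18.0-305.10.2.el8_4.x86_64". */
static bool parse_release(const char *release, struct krel *out)
{
    static const char prefix[] = "4.18.0-";
    size_t plen = sizeof(prefix) - 1;

    assert(release != NULL && out != NULL);
    if (strncmp(release, prefix, plen) != 0)
        return false;

    memset(out, 0, sizeof(*out));
    /* sscanf stops at the first non-numeric field (".el8..."), so a GA
     * release like "305.el8" fills only build and leaves z[] at zero. */
    return sscanf(release + plen, "%d.%d.%d",
                  &out->build, &out->z[0], &out->z[1]) >= 1;
}

/* True for 4.18.0-305.el8 up to, but not including, 4.18.0-305.12.1.el8_4. */
bool is_affected(const char *release)
{
    struct krel k;

    if (!parse_release(release, &k) || k.build != 305)
        return false;
    if (k.z[0] != 12)
        return k.z[0] < 12;
    return k.z[1] < 1;
}
```

For example, is_affected("4.18.0-305.10.2.el8_4.x86_64") is true, while is_affected("4.18.0-305.12.1.el8_4.x86_64") and is_affected("4.18.0-348.el8.x86_64") are false. Kernels from other distributions or streams are outside what this sketch models.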

Diagnostic Steps

vmcore analysis:

      KERNEL: /cores/retrace/repos/kernel/x86_64/usr/lib/debug/lib/modules/4.18.0-305.el8.x86_64/vmlinux
    DUMPFILE: /cores/retrace/tasks/140115919/crash/vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Wed Jun  2 15:57:36 GMT 2021
      UPTIME: 6 days, 14:36:17
LOAD AVERAGE: 0.00, 0.00, 0.00
       TASKS: 374
     RELEASE: 4.18.0-305.el8.x86_64
     VERSION: #1 SMP Thu Apr 29 08:54:30 EDT 2021
     MACHINE: x86_64  (2399 Mhz)
      MEMORY: 12 GB
       PANIC: "kernel BUG at lib/list_debug.c:50!"

        DMI_BIOS_VENDOR: Phoenix Technologies LTD
       DMI_BIOS_VERSION: 6.00
          DMI_BIOS_DATE: 12/12/2018

  x86_model_id = "Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz",
  microcode = 0x2006906,

[   13.097059] vmxnet3 0000:13:00.0 lan1: intr type 3, mode 0, 5 vectors allocated
[   13.097753] vmxnet3 0000:13:00.0 lan1: NIC Link is Up 10000 Mbps
[570928.662632] list_del corruption, ffff8b3c3b76b048->prev is LIST_POISON2 (dead000000000200)
[570928.662739] ------------[ cut here ]------------
[570928.662740] kernel BUG at lib/list_debug.c:50!
[570928.662773] invalid opcode: 0000 [#1] SMP PTI
[570928.662790] CPU: 2 PID: 756280 Comm: kworker/2:0 Kdump: loaded Not tainted 4.18.0-305.el8.x86_64 #1
[570928.662818] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[570928.662853] Workqueue: cgroup_destroy css_release_work_fn
[570928.662874] RIP: 0010:__list_del_entry_valid.cold.1+0x45/0x4c
[570928.662894] Code: e8 8a a5 cb ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 40 66 10 95 e8 76 a5 cb ff 0f 0b 48 89 fe 48 c7 c7 08 66 10 95 e8 65 a5 cb ff <0f> 0b 90 90 90 90 90 41 55 41 54 55 53 48 85 d2 74 5f 48 85 f6 74
[570928.662950] RSP: 0018:ffffa22203613e68 EFLAGS: 00010246
[570928.662969] RAX: 000000000000004e RBX: ffff8b3c3b76b090 RCX: 0000000000000000
[570928.662992] RDX: 0000000000000000 RSI: ffff8b3f33d167c8 RDI: ffff8b3f33d167c8
[570928.663014] RBP: ffffffff95826040 R08: 00000000000005b7 R09: 0000000000aaaaaa
[570928.663037] R10: 0000000000000000 R11: ffffa22202dff200 R12: ffff8b3c3b76b000
[570928.663059] R13: ffff8b3f2c0b0000 R14: ffff8b3dfe60d240 R15: ffff8b3c3b76b098
[570928.663082] FS:  0000000000000000(0000) GS:ffff8b3f33d00000(0000) knlGS:0000000000000000
[570928.663107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[570928.663126] CR2: 00007f2a5bddc500 CR3: 00000001e1a10005 CR4: 00000000003706e0
[570928.663184] Call Trace:
[570928.663204]  css_release_work_fn+0x3f/0x240
[570928.663254]  process_one_work+0x1a7/0x360
[570928.663276]  worker_thread+0x30/0x390
[570928.663291]  ? create_worker+0x1a0/0x1a0
[570928.663305]  kthread+0x116/0x130
[570928.663326]  ? kthread_flush_work_fn+0x10/0x10
[570928.663344]  ret_from_fork+0x35/0x40
[570928.663361] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock intel_rapl_msr intel_rapl_common sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr i2c_piix4 vmw_vmci ip_tables xfs libcrc32c sr_mod cdrom ata_generic vmwgfx sd_mod t10_pi sg drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel ata_piix ahci libahci serio_raw libata vmxnet3 vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod fuse

crash> list_head ffff8b3c3b76b048
struct list_head {
  next = 0xffff8b3efb312058, 
  prev = 0xdead000000000200
}

crash> bt
PID: 756280  TASK: ffff8b3f1fff17c0  CPU: 2   COMMAND: "kworker/2:0"
 #0 [ffffa22203613bf0] machine_kexec at ffffffff9406156e
 #1 [ffffa22203613c48] __crash_kexec at ffffffff9418f99d
 #2 [ffffa22203613d10] crash_kexec at ffffffff9419088d
 #3 [ffffa22203613d28] oops_end at ffffffff9402434d
 #4 [ffffa22203613d48] do_trap at ffffffff94020b13
 #5 [ffffa22203613d90] do_invalid_op at ffffffff94021476
 #6 [ffffa22203613db0] invalid_op at ffffffff94a00d64
    [exception RIP: __list_del_entry_valid.cold.1+69]
    RIP: ffffffff94491209  RSP: ffffa22203613e68  RFLAGS: 00010246
    RAX: 000000000000004e  RBX: ffff8b3c3b76b090  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffff8b3f33d167c8  RDI: ffff8b3f33d167c8
    RBP: ffffffff95826040   R8: 00000000000005b7   R9: 0000000000aaaaaa
    R10: 0000000000000000  R11: ffffa22202dff200  R12: ffff8b3c3b76b000
    R13: ffff8b3f2c0b0000  R14: ffff8b3dfe60d240  R15: ffff8b3c3b76b098
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa22203613e60] __list_del_entry_valid.cold.1 at ffffffff94491209
 #8 [ffffa22203613e68] css_release_work_fn at ffffffff94196b0f
 #9 [ffffa22203613e98] process_one_work at ffffffff940fe397
#10 [ffffa22203613ed8] worker_thread at ffffffff940fea60
#11 [ffffa22203613f10] kthread at ffffffff94104406
#12 [ffffa22203613f50] ret_from_fork at ffffffff94a00255

crash> dis -rl ffffffff94491209 | tail
0xffffffff944911e9 <__list_del_entry_valid.cold.1+37>:  mov    %rdi,%rsi
0xffffffff944911ec <__list_del_entry_valid.cold.1+40>:  mov    $0xffffffff95106640,%rdi
0xffffffff944911f3 <__list_del_entry_valid.cold.1+47>:  callq  0xffffffff9414b76e <printk>
/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/lib/list_debug.c: 51
0xffffffff944911f8 <__list_del_entry_valid.cold.1+52>:  ud2    
0xffffffff944911fa <__list_del_entry_valid.cold.1+54>:  mov    %rdi,%rsi
0xffffffff944911fd <__list_del_entry_valid.cold.1+57>:  mov    $0xffffffff95106608,%rdi
0xffffffff94491204 <__list_del_entry_valid.cold.1+64>:  callq  0xffffffff9414b76e <printk>
/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/lib/list_debug.c: 48
0xffffffff94491209 <__list_del_entry_valid.cold.1+69>:  ud2   

 38 bool __list_del_entry_valid(struct list_head *entry)
 39 {
 40         struct list_head *prev, *next;
 41 
 42         prev = entry->prev;
 43         next = entry->next;
 44 
 45         if (CHECK_DATA_CORRUPTION(next == LIST_POISON1,
 46                         "list_del corruption, %px->next is LIST_POISON1 (%px)\n",
 47                         entry, LIST_POISON1) ||
 48             CHECK_DATA_CORRUPTION(prev == LIST_POISON2,
 49                         "list_del corruption, %px->prev is LIST_POISON2 (%px)\n",
 50                         entry, LIST_POISON2) ||
 51             CHECK_DATA_CORRUPTION(prev->next != entry,
 52                         "list_del corruption. prev->next should be %px, but was %px\n",
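
The check above can be reproduced in user space. The following is a simplified sketch, not the kernel implementation: it returns false where the kernel would BUG(), the function names list_add_entry, list_del_entry_valid and list_del_rcu_checked are hypothetical, and it assumes a 64-bit build. LIST_POISON2 uses the value printed in the log (dead000000000200); the LIST_POISON1 value is assumed to follow the same pattern. The delete helper poisons only prev, as list_del_rcu() does, which would be consistent with the list_head output above where next is still a valid pointer. One plausible reading is that the entry had already been unlinked once before this call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Poison values for x86_64; LIST_POISON1 is an assumption (see lead-in). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000200ULL)

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
void list_add_entry(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Same order of checks as the excerpt above, but returns false instead
 * of triggering a BUG. */
bool list_del_entry_valid(const struct list_head *entry)
{
    assert(entry != NULL);
    if (entry->next == LIST_POISON1) {
        fprintf(stderr, "list_del corruption, %p->next is LIST_POISON1\n",
                (const void *)entry);
        return false;
    }
    if (entry->prev == LIST_POISON2) {
        fprintf(stderr, "list_del corruption, %p->prev is LIST_POISON2\n",
                (const void *)entry);
        return false;
    }
    if (entry->prev->next != entry || entry->next->prev != entry) {
        fprintf(stderr, "list_del corruption, neighbours of %p disagree\n",
                (const void *)entry);
        return false;
    }
    return true;
}

/* Unlink entry and poison only prev, as list_del_rcu() does. */
bool list_del_rcu_checked(struct list_head *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->prev = LIST_POISON2;
    return true;
}
```

Calling list_del_rcu_checked() twice on the same entry succeeds the first time and fails the second time on the LIST_POISON2 check, which is the same condition reported at lib/list_debug.c:50 in this vmcore.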

crash> dis -rl ffffffff94196b0f | tail
/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/kernel/cgroup/cgroup.c: 4975
0xffffffff94196aee <css_release_work_fn+30>:    mov    $0xffffffff956b5f20,%rdi
0xffffffff94196af5 <css_release_work_fn+37>:    lea    -0x90(%rbx),%r12
0xffffffff94196afc <css_release_work_fn+44>:    callq  0xffffffff9494a400 <mutex_lock>
/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/kernel/cgroup/cgroup.c: 4977
0xffffffff94196b01 <css_release_work_fn+49>:    orl    $0x4,-0x14(%rbx)
/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/./include/linux/list.h: 131
0xffffffff94196b05 <css_release_work_fn+53>:    lea    0x48(%r12),%rdi
0xffffffff94196b0a <css_release_work_fn+58>:    callq  0xffffffff94491150 <__list_del_entry_valid>

4968 static void css_release_work_fn(struct work_struct *work)
4969 {
4970         struct cgroup_subsys_state *css =
4971                 container_of(work, struct cgroup_subsys_state, destroy_work);
4972         struct cgroup_subsys *ss = css->ss;
4973         struct cgroup *cgrp = css->cgroup;
4974 
4975         mutex_lock(&cgroup_mutex);
4976 
4977         css->flags |= CSS_RELEASED;
4978         list_del_rcu(&css->sibling);

crash> work_struct ffff8b3c3b76b090
struct work_struct {
  data = {
    counter = 128
  }, 
  entry = {
    next = 0xffff8b3c3b76b098, 
    prev = 0xffff8b3c3b76b098
  }, 
  func = 0xffffffff94196ad0, 
  rh_reserved1 = 0, 
  rh_reserved2 = 0, 
  rh_reserved3 = 0, 
  rh_reserved4 = 0
}

crash> cgroup_subsys_state.destroy_work -ox
struct cgroup_subsys_state {
   [0x90] struct work_struct destroy_work;
}

crash> px 0xffff8b3c3b76b090-0x90
$4 = 0xffff8b3c3b76b000
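
The subtraction above is the container_of() step from css_release_work_fn(). The following sketch reproduces that arithmetic with a toy structure; toy_css and toy_work are hypothetical stand-ins that model only the two offsets visible in this session (0x48 for sibling, from the lea 0x48(%r12),%rdi instruction, and 0x90 for destroy_work), not the real struct cgroup_subsys_state layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy stand-ins; only the offsets seen in the crash session are modeled. */
struct toy_work {
    uint64_t data;
};

struct toy_css {
    unsigned char pad0[0x48];
    uint64_t sibling[2];                /* next, prev */
    unsigned char pad1[0x90 - 0x48 - 2 * sizeof(uint64_t)];
    struct toy_work destroy_work;
};

static_assert(offsetof(struct toy_css, sibling) == 0x48, "sibling offset");
static_assert(offsetof(struct toy_css, destroy_work) == 0x90,
              "destroy_work offset");

/* Pointer form: recover the enclosing object from its embedded work item. */
struct toy_css *css_from_work(struct toy_work *work)
{
    return container_of(work, struct toy_css, destroy_work);
}

/* Address form, for values copied out of a vmcore. */
uint64_t css_addr_from_work_addr(uint64_t work_addr)
{
    return work_addr - offsetof(struct toy_css, destroy_work);
}

uint64_t sibling_addr_from_css_addr(uint64_t css_addr)
{
    return css_addr + offsetof(struct toy_css, sibling);
}
```

With the work_struct address in RBX (ffff8b3c3b76b090), css_addr_from_work_addr() gives ffff8b3c3b76b000, and sibling_addr_from_css_addr() on that result gives ffff8b3c3b76b048, the list entry named in the list_del corruption message.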

/usr/src/debug/kernel-4.18.0-305.el8/linux-4.18.0-305.el8.x86_64/kernel/cgroup/cgroup.c: 4973
    4973    struct cgroup *cgrp = css->cgroup;
0xffffffff94196ae7 <css_release_work_fn+23>:    mov    -0x90(%rdi),%r13

R13: ffff8b3f2c0b0000 // cgroup

crash> cgroup_subsys_state.cgroup 0xffff8b3c3b76b000
  cgroup = 0xffff8b3f2c0b0000

crash> cgroup_subsys_state 0xffff8b3c3b76b000 -x
struct cgroup_subsys_state {
  cgroup = 0xffff8b3f2c0b0000, 
  ss = 0xffffffff95826040, 
  refcnt = {
    count = {
      counter = 0x0
    }, 
    percpu_count_ptr = 0x3, 
    release = 0xffffffff94193db0, 
    confirm_switch = 0x0, 
    force_atomic = 0x0, 
    allow_reinit = 0x0, 
    rcu = {
      next = 0xffff8b3f2c0b0b90, 
      func = 0x0
    }
  }, 
  sibling = {
    next = 0xffff8b3efb312058, 
    prev = 0xdead000000000200 <<
  }, 
  children = {
    next = 0xffff8b3c3b76b058, 
    prev = 0xffff8b3c3b76b058
  }, 
  rstat_css_node = {
    next = 0xffff8b3c3b76b068, 
    prev = 0xffff8b3c3b76b068
  }, 
  id = 0x87, 
  flags = 0x14, 
  serial_nr = 0xc394b, 
  online_cnt = {
    counter = 0x0
  }, 
  destroy_work = {
    data = {
      counter = 0x80
    }, 
    entry = {
      next = 0xffff8b3c3b76b098, 
      prev = 0xffff8b3c3b76b098
    }, 
    func = 0xffffffff94196ad0, 
    rh_reserved1 = 0x0, 
    rh_reserved2 = 0x0, 
    rh_reserved3 = 0x0, 
    rh_reserved4 = 0x0
  }, 
  destroy_rwork = {
    work = {
      data = {
        counter = 0xfffffffe1
      }, 
      entry = {
        next = 0xffff8b3c3b76b0d8, 
        prev = 0xffff8b3c3b76b0d8
      }, 
      func = 0xffffffff9419b410, 
      rh_reserved1 = 0x0, 
      rh_reserved2 = 0x0, 
      rh_reserved3 = 0x0, 
      rh_reserved4 = 0x0
    }, 
    rcu = {
      next = 0xffff8b3d6bd46b10, 
      func = 0xffffffff940fe1c0
    }, 
    wq = 0xffff8b3d06325e00
  }, 
  parent = 0xffff8b3efb312000
}


crash> cgroup_subsys_state.flags -x 0xffff8b3c3b76b000
  flags = 0x14

 50 enum {           
 51         CSS_NO_REF      = (1 << 0), /* no reference counting for this css */
 52         CSS_ONLINE      = (1 << 1), /* between ->css_online() and ->css_offline() */
 53         CSS_RELEASED    = (1 << 2), /* refcnt reached zero, released */

crash> pd (1 << 2)
$5 = 4

crash> pd (0x14 && 0x04)
$6 = 1
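
Note that the && in the last command is a logical AND, which yields 1 whenever both operands are non-zero. To test a single flag bit, a bitwise AND is the usual form. A minimal sketch, using the three enum values from the excerpt above and a hypothetical helper named css_flag_is_set:

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the enum excerpt above. */
enum {
    CSS_NO_REF   = (1 << 0),
    CSS_ONLINE   = (1 << 1),
    CSS_RELEASED = (1 << 2),
};

/* Hypothetical helper: true when the given flag bit is set in flags. */
bool css_flag_is_set(unsigned int flags, unsigned int flag)
{
    assert(flag != 0);
    return (flags & flag) != 0;
}
```

For flags = 0x14, css_flag_is_set(0x14, CSS_RELEASED) is true and css_flag_is_set(0x14, CSS_ONLINE) is false. Keep in mind that css_release_work_fn() sets CSS_RELEASED at line 4977, immediately before the list_del_rcu() call that crashed, so the flag being set at crash time is expected.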

The MAPPING field of the page backing this css also holds a poison value (dead000000000400):

crash> kmem 0xffff8b3c3b76b000
CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
ffff8b3d07c028c0     4096        744       824    103    32k  kmalloc-4k
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  ffffd16c40edda00  ffff8b3c3b768000     0      8          1     7
  FREE / [ALLOCATED]
  [ffff8b3c3b76b000]

      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffd16c40eddac0  3b76b000 dead000000000400        0  0 fffffc0000000

crash> cgroup_subsys_state.cgroup 0xffff8b3c3b76b000
  cgroup = 0xffff8b3f2c0b0000

R13: ffff8b3f2c0b0000

crash> kmem ffff8b3f2c0b0000
CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
ffff8b3d07c028c0     4096        744       824    103    32k  kmalloc-4k
  SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
  ffffd16c4cb02c00  ffff8b3f2c0b0000     0      8          7     1
  FREE / [ALLOCATED]
  [ffff8b3f2c0b0000]

      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffd16c4cb02c00 32c0b0000 ffff8b3d07c028c0 ffff8b3f2c0b1000  1 17ffffc0008100 slab,head

crash> cgroup.kn 0xffff8b3f2c0b0000
  kn = 0xffff8b3dc0f8bc38

crash> kernfs_node.name 0xffff8b3dc0f8bc38
  name = 0xffff8b3f2e498300 "user-runtime-dir@1005.service"

crash> kernfs_node.name 0xffff8b3efb20d330
  name = 0xffff8b3f1f386dc0 "system-user\\x2druntime\\x2ddir.slice"

crash> cgroup_subsys_state.sibling ffff8b3f2c0b0000
  sibling = {
    next = 0xffff8b3efb311058, 
    prev = 0xffff8b3efb311058
  }

crash> cgroup_subsys_state.sibling ffff8b3f2c0b0000 -ox
struct cgroup_subsys_state {
  [ffff8b3f2c0b0048] struct list_head sibling;
}

crash> list -H ffff8b3f2c0b0048
ffff8b3efb311058

crash> kmem -i
                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM  3020862      11.5 GB         ----
         FREE    60625     236.8 MB    2% of TOTAL MEM
         USED  2960237      11.3 GB   97% of TOTAL MEM
       SHARED    96512       377 MB    3% of TOTAL MEM
      BUFFERS        1         4 KB    0% of TOTAL MEM
       CACHED  1559338       5.9 GB   51% of TOTAL MEM
         SLAB    23532      91.9 MB    0% of TOTAL MEM

   TOTAL HUGE        0            0         ----
    HUGE FREE        0            0    0% of TOTAL HUGE

   TOTAL SWAP   488447       1.9 GB         ----
    SWAP USED   112130       438 MB   22% of TOTAL SWAP
    SWAP FREE   376317       1.4 GB   77% of TOTAL SWAP

 COMMIT LIMIT  1998878       7.6 GB         ----
    COMMITTED  2011970       7.7 GB  100% of TOTAL LIMIT

======================================================================
           [ RSS usage ]          [ Process name ]
======================================================================
         4 GiB (   5087396 KiB)   mysqld
        36 MiB (     37428 KiB)   firewalld
        32 MiB (     33348 KiB)   beremote
        27 MiB (     27900 KiB)   tuned
        23 MiB (     23976 KiB)   polkitd
        18 MiB (     18584 KiB)   snmpd
        17 MiB (     18376 KiB)   NetworkManager
        15 MiB (     15944 KiB)   systemd-journal
        13 MiB (     14208 KiB)   systemd
        12 MiB (     12828 KiB)   vmtoolsd
======================================================================
Total memory usage from user-space = 5.12 GiB

crash> mod -t
no tainted modules


17 Comments

Is there any update on a fix for this issue? We have been waiting for it to be fixed in the following kernel, but evidently it has not been.

There is a candidate for a fix, which might make it into the next errata kernel. Please open a case for more details and to stay informed on the time plan, and if the time plan gets changed.

Thanks for the update, Christian! I do have a case open, but the support team doesn't have visibility into the timeline for the fix.

I had a read over the case, I think the last updates in the case are in line with what I commented here in this thread.

I also see this problem in RELEASE: 4.18.0-240.22.1.el8_3.s390x:

[132112.506475] list_del corruption. next->prev should be 000040000b4bc148, but was 0000000000000200

and in RELEASE: 4.18.0-193.el8.s390x I also saw:

[1655815.174757] list_add corruption. prev->next should be next (00000002d68e3518), but was 0000000000000100. (prev=000040000b2eba88)

Hi Aleksandra Pavic, would you please open a ticket, or can you share the full call trace with us?

Here is the info from dmesg. If you need more info for one or both problems (list_del and list_add), I can open a ticket. Just say for which one, please.

[132112.506475] list_del corruption. next->prev should be 000040000b4bc148, but was 0000000000000200
[132112.506505] ------------[ cut here ]------------
[132112.506506] kernel BUG at lib/list_debug.c:56!
[132112.506538] illegal operation: 0001 ilc:1 [#1] SMP
[132112.506540] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache 8021q garp mrp stp llc bonding sunrpc dm_service_time dm_multipath ghash_s390 prng xts aes_s390 des_s390 des_generic sha3_512_s390 sha3_256_s390 sha512_s390 eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio binfmt_misc ip_tables ext4 mbcache jbd2 sd_mod sg qeth_l2 zfcp dasd_eckd_mod qeth ccwgroup scsi_transport_fc qdio dasd_mod dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: tracedev]
[132112.506576] CPU: 3 PID: 1334857 Comm: gpfsDriver_fvtc Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-240.22.1.el8_3.s390x #1
[132112.506577] Hardware name: IBM 3906 M05 710 (LPAR)
[132112.506579] Krnl PSW : 0704d00180000000 00000001aa97d9c4 (__list_del_entry_valid+0x94/0xc0)
[132112.506588]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[132112.506591] Krnl GPRS: 0000000000000000 00000001ab327ac8 0000000000000054 00000003f12fab08
[132112.506593]            00000003f130b300 0000000000000000 0000000000000001 0000000200000000
[132112.506594]            000040000b4bc148 00000001aa7b6d3a 00000002d2f05800 00000001bc480a00
[132112.506597]            00000002deb8a200 00000001bc480a00 00000001aa97d9c0 00000002e24cfc30
[132112.506641] Krnl Code: 00000001aa97d9b4: c0200026bb07       larl    %r2,1aae54fc2
           00000001aa97d9ba: c0e5ffe41ebb       brasl   %r14,1aa601730
          #00000001aa97d9c0: a7f40001           brc     15,1aa97d9c2
          >00000001aa97d9c4: b9040032           lgr     %r3,%r2
           00000001aa97d9c8: c0200026bade       larl    %r2,1aae54f84
           00000001aa97d9ce: c0e5ffe41eb1       brasl   %r14,1aa601730
           00000001aa97d9d4: a7f40001           brc     15,1aa97d9d6
           00000001aa97d9d8: b9040032           lgr     %r3,%r2
[132112.506657] Call Trace:
[132112.506660] ([<00000001aa97d9c0>] __list_del_entry_valid+0x90/0xc0)
[132112.506665]  [<00000001aa5725c2>] page_table_alloc+0xd2/0x2a0
[132112.506669]  [<00000001aa7b6d3a>] do_huge_pmd_anonymous_page+0x1aa/0x718
[132112.506674]  [<00000001aa768ad8>] __handle_mm_fault+0x9f0/0xaa0
[132112.506676]  [<00000001aa768c6c>] handle_mm_fault+0xe4/0x188
[132112.506681]  [<00000001aa56bf96>] do_dat_exception+0x176/0x400
[132112.506687]  [<00000001aaca6296>] pgm_check_handler+0x1ae/0x204
[132112.506688] Last Breaking-Event-Address:
[132112.506690]  [<00000001aa97d9c0>] __list_del_entry_valid+0x90/0xc0

and the other one:

[1655815.174757] list_add corruption. prev->next should be next (00000002d68e3518), but was 0000000000000100. (prev=000040000b2eba88).      <-------!!!! corrupted list    19 days 03:56:55.
[1655815.174797] ------------[ cut here ]------------
[1655815.174803] kernel BUG at lib/list_debug.c:28!
[1655815.174872] illegal operation: 0001 ilc:1 [#1] SMP
[1655815.174881] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) nls_utf8 isofs loop nfsv3 nfs_acl rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache 8021q garp mrp stp llc bonding dm_service_time dm_multipath ghash_s390 prng xts aes_s390 des_s390 des_generic sha3_512_s390 sha3_256_s390 sha512_s390 qeth_l2 zcrypt_cex4 binfmt_misc auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod sg qeth_l3 qeth ccwgroup dasd_eckd_mod dasd_fba_mod dasd_mod zfcp scsi_transport_fc qdio dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: tracedev]
[1655815.174930] CPU: 1 PID: 2945398 Comm: gpfsDriver_regc Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-193.el8.s390x #1
[1655815.174935] Hardware name: IBM 3906 M05 710 (z/VM 7.2.0)
[1655815.174951] Krnl PSW : 0704d00180000000 0000000103bb5b20 (__list_del_entry_valid+0x0/0xc0)    <----!!!!
[1655815.174962]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[1655815.174964] Krnl GPRS: 000000000000378d 000000010445fac8 0000000000000075 00000003f534db08
[1655815.174966]            00000003f535e300 0000000000000000 00000001039a76b2 000040000b2d5fc8
[1655815.174968]            00000002e04e3dd0 00000002cb57f800 00000002d68e3200 00000002d68e3518
[1655815.174969]            00000002cfca4400 0000000000000001 0000000103bb5b1c 00000002e04e3b60
[1655815.174979] Krnl Code: 0000000103bb5b10: c020002545af      larl    %r2,10405e66e
           0000000103bb5b16: c0e5ffe4f921       brasl   %r14,103854d58
          #0000000103bb5b1c: a7f40001           brc     15,103bb5b1e
          >0000000103bb5b20: ebeff0880024       stmg    %r14,%r15,136(%r15)
           0000000103bb5b26: a7f13fe0           tmll    %r15,16352
           0000000103bb5b2a: b90400ef           lgr     %r14,%r15
           0000000103bb5b2e: a7840001           brc     8,103bb5b30
           0000000103bb5b32: e3f0ffe8ff71       lay     %r15,-24(%r15)
[1655815.174996] Call Trace:
[1655815.174998] ([<0000000103bb5b1c>] __list_add_valid+0xa4/0xa8)
[1655815.175003]  [<00000001037c9ace>] page_table_free_rcu+0x136/0x198
[1655815.175009]  [<00000001039a76b2>] free_pgd_range+0x2fa/0x6e8
[1655815.175010]  [<00000001039a7bba>] free_pgtables+0x11a/0x158
[1655815.175013]  [<00000001039b10fc>] unmap_region+0xe4/0x120
[1655815.175015]  [<00000001039b343e>] do_munmap+0x256/0x418
[1655815.175017]  [<00000001039b3676>] vm_munmap+0x76/0xa8
[1655815.175020]  [<00000001039b36e8>] sys_munmap+0x40/0x50
[1655815.175024]  [<0000000103ebd476>] system_call+0x2aa/0x2c8
[1655815.175025] Last Breaking-Event-Address:
[1655815.175027]  [<0000000103bb5b1c>] __list_add_valid+0xa4/0xa8
[1655815.175028]
[1655815.175029] Kernel panic - not syncing: Fatal exception in interrupt

Aleksandra Pavic, thank you for sharing the call trace with me. This does not appear to be the same issue as the one in this solution. Please contact Red Hat Technical Support and open a ticket, providing vmcores and an sosreport. https://access.redhat.com/support/policy/support_process

Thanks,

I have opened two bugzillas, 193973 and 193974, but it looks like they will be closed before I finish uploading the crashes, with the explanation that it is fixed for RHEL 8.4. Can I assume then that this is fixed?

I would like to ask when the bug will be fixed in CentOS Stream 8. There are kernels a little newer than in RHEL 8.4, and they probably still do not have the fix (hosts are affected from time to time by kernel crashing with this bug). Thank you.

Where is this bug (list_del corruption with LIST_POISON2) covered in the erratum below? https://access.redhat.com/errata/RHSA-2021:3057

The bug can only be hit with rhel8.4 kernel versions that are newer than or equal to 4.18.0-305.el8 but are older than 4.18.0-305.12.1.el8_4 because it's caused by upstream 3de7d4f25a74 that was introduced to 4.18.0-305.el8 and onwards.

What should I understand from the above statement?

Hello Sir,

What should I understand from the above statement?

The bug can only be hit on versions within the range 4.18.0-305.el8 to 4.18.0-305.10.2.el8_4, inclusive.

If you are running an 8.4.z kernel whose version is within that range, please upgrade the kernel to one of the following, and then check whether the issue is resolved and does not come back:

  • kernel-4.18.0-305.12.1.el8_4 or newer (8.4.z)
  • kernel-4.18.0-348.el8 or newer (8.5 or newer)

Please open a support case with Red Hat support if you have any further questions or concerns regarding the issue/bug.

Hello Seiji,

Thank you very much for the clarification.

I also encountered the same issue on RHEL8.6, kernel 4.18.0-372.9.1.el8.x86_64. RedHat bug#2154842

Description of problem: When booting RHEL 8.6, the output shows "kernel BUG at lib/list_debug.c:28!" once in a while. The crash log is listed below.

[   13.685975] list_del corruption. next->prev should be ff13ccac58102a88, but was ff13ccad05737568
[   13.690886]  nvme5n1: p1
[   13.690969]  nvme4n1: p1
[   13.701512] ------------[ cut here ]------------
[   13.701514] kernel BUG at lib/list_debug.c:56!
[   13.741589] ata1: SATA link down (SStatus 0 SControl 300)
[   13.752833] invalid opcode: 0000 [#1] SMP NOPTI
[   13.831922] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G           OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[   13.831929] Hardware name: Lenovo ThinkSystem SR645 V3/SB27B31173, BIOS KAE105F-1.20 12/01/2022
[   13.831934] Workqueue: events once_deferred
[   13.831949] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
[   13.876521] Code: 65 b2 8f e8 fa ed c9 ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 c0 65 b2 8f e8 e6 ed c9 ff 0f 0b 48 c7 c7 70 66 b2 8f e8 d8 ed c9 ff <0f> 0b 489 fe 48 c7 c7 30 66 b2 8f e8 c4 ed c9 ff 0f 0b
[   13.876524] RSP: 0018:ff334141c019fe98 EFLAGS: 00010046
[   13.876527] RAX: 0000000000000054 RBX: ff13ccac58102a80 RCX: 0000000000000000
[   13.924881] RDX: 0000000000000000 RSI: ff13ccea8c216758 RDI: ff13ccea8c216758
[   13.924885] RBP: ff13ccea8c22a740 R08: 0000000000000000 R09: 0000000000aaaaaa
[   13.924886] R10: 0000000000000000 R11: ff334141daaff020 R12: ff13ccea8c230900
[   13.924887] R13: 0000000000000000 R14: ff13ccac4017b140 R15: ff13ccac58102a88
[   13.924888] FS:  0000000000000000(0000) GS:ff13ccea8c200000(0000) knlGS:0000000000000000
[   13.924891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   13.924892] CR2: 00007f1f02e7da58 CR3: 0000005c77410002 CR4: 0000000000771ef0
[   13.981197] ata9: SATA link down (SStatus 0 SControl 300)
[   13.983830] PKRU: 55555554
[   13.983836] Call Trace:
[   13.983843]  process_one_work+0x10e/0x360
[   14.035979]  ? create_worker+0x1a0/0x1a0
[   14.043937]  worker_thread+0x30/0x390
[   14.043940]  ? create_worker+0x1a0/0x1a0
[   14.043944]  kthread+0x10a/0x120
[   14.066368]  ? set_kthread_struct+0x40/0x40
[   14.069280] ata2: SATA link down (SStatus 0 SControl 300)
[   14.074471]  ret_from_fork+0x35/0x40
[   14.091529] Modules linked in: crc32c_intel nvme(+) ahci(+) cdc_ether libahci nvme_core usbnet libata i40e(OE+) t10_pi ice(OE+) mii sunrpc
[   14.112953] ---[ end trace 9f08afe8c9138e54 ]---
[   26.124197] ata3: SATA link down (SStatus 0 SControl 300)
[   30.555719] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
[   30.574306] Code: 65 b2 8f e8 fa ed c9 ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 c0 65 b2 8f e8 e6 ed c9 ff 0f 0b 48 c7 c7 70 66 b2 8f e8 d8 ed c9 ff <0f> 0b 489 fe 48 c7 c7 30 66 b2 8f e8 c4 ed c9 ff 0f 0b
[   30.602932] RSP: 0018:ff334141c019fe98 EFLAGS: 00010046
[   30.612572] RAX: 0000000000000054 RBX: ff13ccac58102a80 RCX: 0000000000000000
[   30.624378] RDX: 0000000000000000 RSI: ff13ccea8c216758 RDI: ff13ccea8c216758
[   30.636141] RBP: ff13ccea8c22a740 R08: 0000000000000000 R09: 0000000000aaaaaa
[   30.647968] R10: 0000000000000000 R11: ff334141daaff020 R12: ff13ccea8c230900
[   30.659854] R13: 0000000000000000 R14: ff13ccac4017b140 R15: ff13ccac58102a88
[   30.671781] FS:  0000000000000000(0000) GS:ff13ccea8c200000(0000) knlGS:0000000000000000
[   30.684876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   30.695274] CR2: 00007f1f02e7da58 CR3: 0000005c77410002 CR4: 0000000000771ef0
[   30.707222] PKRU: 55555554
[   30.714224] Kernel panic - not syncing: Fatal exception
[   32.125203] Shutting down cpus with NMI
[   32.133540] Kernel Offset: 0xda00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   57.108713] ---[ end Kernel panic - not syncing: Fatal exception ]---

I see in the bugzilla you opened that a third-party kernel module is also involved. While I see that engineering has looked at the bugzilla, I recommend also opening a customer portal case.

Thanks, Christian. We have found that RHEL 8.6 is missing the upstream kernel patch "df81dfcfd699 genirq: Fix reference leaks on irq affinity notifiers".