The kernel crashes after rcu_do_batch() due to "exception RIP: unknown or invalid address"

Solution Verified

Issue

  • The kernel crashes with the following messages:
[945551.027107] WARNING: CPU: 57 PID: 694906 at lib/radix-tree.c:591 delete_node+0x73/0x1d0
[945551.027108] Modules linked in: ...
[945551.027159] CPU: 57 PID: 694906 Comm: kworker/57:4 Kdump: loaded Tainted: P           OE    --------- -  - 4.18.0-305.72.1.el8_4.x86_64 #1
[945551.027160] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 06/01/2022
[945551.027165] Workqueue: events percpu_stats_free_rwork_fn
[945551.027167] RIP: 0010:delete_node+0x73/0x1d0
[945551.027169] Code: 18 48 85 db 75 c3 8b 45 04 a8 04 75 08 25 ff ff 7f 00 89 45 04 48 c7 45 08 00 00 00 00 48 8b 46 18 48 39 c7 0f 84 05 01 00 00 <0f> 0b 4c 89 e6 e8 03 f9 82 ff 48 85 db 75 b2 bb 01 00 00 00 89 d8
[945551.027170] RSP: 0018:ffffa36335dabe38 EFLAGS: 00010282
[945551.027170] RAX: 0000000000000000 RBX: ffff9530702df8e8 RCX: 0000000000000007
[945551.027171] RDX: 0000000000000000 RSI: ffff94dcaf7cd220 RDI: ffff94dcaf7cd238
[945551.027172] RBP: ffffffff96ceee40 R08: 0000000000000228 R09: ffff94dcaf7cd288
[945551.027173] R10: ffff94dcaf7cd220 R11: 0000000000000000 R12: ffffffff95f2f750
[945551.027173] R13: 0000000000000000 R14: ffff947b1a1a23c0 R15: ffff947db26d3508
[945551.027174] FS:  0000000000000000(0000) GS:ffff94d07fb40000(0000) knlGS:0000000000000000
[945551.027175] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[945551.027175] CR2: 000055ad01597c30 CR3: 000000b5a4e10002 CR4: 00000000007726e0
[945551.027176] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[945551.027176] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[945551.027177] PKRU: 55555554
[945551.027177] Call Trace:
[945551.027183]  radix_tree_delete_item+0x69/0xc0
[945551.027185]  mem_cgroup_id_put_many+0x37/0x50
[945551.027191]  process_one_work+0x1a7/0x360
[945551.027192]  worker_thread+0x30/0x390
[945551.027193]  ? create_worker+0x1a0/0x1a0
[945551.027196]  kthread+0x116/0x130
[945551.027197]  ? kthread_flush_work_fn+0x10/0x10
[945551.027202]  ret_from_fork+0x1f/0x40
[945551.027204] ---[ end trace 439aedfdfc9478ee ]---
[945551.039888] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[945551.047535] BUG: unable to handle kernel paging request at ffff94dcaf7cd238
[945551.054640] PGD c03fdfc067 P4D c03fdfc067 PUD b593e77063 PMD 8000006aef6001e3 
[945551.062014] Oops: 0011 [#1] SMP NOPTI
[945551.065795] CPU: 57 PID: 0 Comm: swapper/57 Kdump: loaded Tainted: P        W  OE    --------- -  - 4.18.0-305.72.1.el8_4.x86_64 #1
[945551.077804] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 06/01/2022
[945551.087283] RIP: 0010:0xffff94dcaf7cd238
[945551.091326] Code: 00 00 00 00 00 00 00 00 00 00 38 eb 7c af dc 94 ff ff 00 07 00 00 00 00 00 00 e8 f8 2d 70 30 95 ff ff 40 ee ce 96 ff ff ff ff <38> d2 7c af dc 94 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[945551.110335] RSP: 0018:ffffa36319a7ced8 EFLAGS: 00010282
[945551.115687] RAX: ffff94dcaf7cd238 RBX: 000000000000000a RCX: 0000000000000010
[945551.122966] RDX: ffff94dcaf7cd238 RSI: 0000000000000000 RDI: ffff94dcaf7cd238
[945551.130244] RBP: 0000000000000000 R08: 0000000000000010 R09: 0000000000000100
[945551.137518] R10: ffff94d07fb69f40 R11: 0000000000000001 R12: 0000000000000000
[945551.144796] R13: ffff94d07fb6ad00 R14: ffffffff957602bf R15: ffff94d07fb6ad50
[945551.152074] FS:  0000000000000000(0000) GS:ffff94d07fb40000(0000) knlGS:0000000000000000
[945551.160311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[945551.166187] CR2: ffff94dcaf7cd238 CR3: 000000b5a4e10002 CR4: 00000000007726e0
[945551.173464] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[945551.180740] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[945551.188016] PKRU: 55555554
[945551.190831] Call Trace:
[945551.193383]  <IRQ>
[945551.195504]  ? rcu_do_batch+0x1a4/0x3d0
[945551.199456]  ? rcu_core+0x18f/0x2f0
[945551.203065]  ? __do_softirq+0xd7/0x2d6
[945551.206930]  ? irq_exit+0xf7/0x100
[945551.210445]  ? smp_apic_timer_interrupt+0x74/0x130
[945551.215362]  ? apic_timer_interrupt+0xf/0x20
[945551.219751]  </IRQ>
[945551.221958]  ? cpuidle_enter_state+0xd9/0x3c0
[945551.226433]  ? cpuidle_enter_state+0xb4/0x3c0
[945551.230910]  ? cpuidle_enter+0x2c/0x40
[945551.234776]  ? do_idle+0x234/0x260
[945551.238291]  ? cpu_startup_entry+0x6f/0x80
[945551.242509]  ? start_secondary+0x198/0x1e0
[945551.246725]  ? secondary_startup_64_no_verify+0xc2/0xcb
[945551.252077] Modules linked in: ...
[945551.375118] CR2: ffff94dcaf7cd238
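
In the messages above, the second oops reports RIP as a bare address rather than symbol+offset. That address equals CR2, and it is also the value RDI held in the earlier delete_node warning. Together with the "kernel tried to execute NX-protected page" line, this suggests the CPU jumped to a data address. The following Python sketch shows one way to cross-check these values in a saved log. It is illustrative only: the function names are hypothetical, the LOG excerpt is copied from the messages above, and a match is a hint for further vmcore analysis, not a root cause.

```python
import re

# Excerpt copied from the messages above; in practice, read the full dmesg text.
LOG = """\
[945551.027167] RIP: 0010:delete_node+0x73/0x1d0
[945551.027171] RDX: 0000000000000000 RSI: ffff94dcaf7cd220 RDI: ffff94dcaf7cd238
[945551.047535] BUG: unable to handle kernel paging request at ffff94dcaf7cd238
[945551.087283] RIP: 0010:0xffff94dcaf7cd238
[945551.375118] CR2: ffff94dcaf7cd238
"""

RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(\S+)")
REG_RE = re.compile(r"\b([A-Z0-9]{2,3}): ([0-9a-f]{16})\b")


def raw_rip_addresses(log):
    """Return RIP values printed as bare addresses (no symbol+offset)."""
    return [int(t, 16) for t in RIP_RE.findall(log) if t.startswith("0x")]


def registers_holding(log, addr):
    """Return, in log order, the names of registers whose dumped value equals addr."""
    return [name for name, value in REG_RE.findall(log) if int(value, 16) == addr]


def jumped_to_faulting_address(log):
    """True when some raw RIP value also appears as CR2 (fault on instruction fetch)."""
    return any("CR2" in registers_holding(log, rip) for rip in raw_rip_addresses(log))


if __name__ == "__main__":
    for rip in raw_rip_addresses(LOG):
        print(hex(rip), "also seen in:", registers_holding(LOG, rip))
    print("RIP equals CR2:", jumped_to_faulting_address(LOG))
```

The sketch matches register names by a simple pattern, so it relies on the standard x86_64 register dump layout shown above.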

Environment

  • Red Hat Enterprise Linux 8.4.z: kernel-4.18.0-305.62.1.el8_4 and later, but earlier than kernel-4.18.0-305.103.1.el8_4
  • Red Hat Enterprise Linux 8.6.z: kernel-4.18.0-372.26.1.el8_6 and later, but earlier than kernel-4.18.0-372.70.1.el8_6
  • Red Hat Enterprise Linux 8.7 GA: kernel-4.18.0-425.3.1.el8 and later
  • Red Hat Enterprise Linux 8.8 GA: kernel-4.18.0-477.10.1.el8_8 and later, but earlier than kernel-4.18.0-477.27.1.el8_8
  • Memory Cgroup
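
To check whether a given kernel falls inside the ranges listed above, a comparison along the following lines could be used. This is a minimal Python sketch under stated assumptions: the function names and the AFFECTED table layout are hypothetical, the release numbers are copied from the list above, and the comparison is a simplified numeric one rather than the full RPM version comparison algorithm. A result of False only means the kernel is outside the listed ranges.

```python
import re

# (dist tag, first affected release, first release outside the range or None).
# Values are taken from the Environment list; 8.7 has no upper bound listed.
AFFECTED = [
    ("el8_4", (305, 62, 1), (305, 103, 1)),
    ("el8_6", (372, 26, 1), (372, 70, 1)),
    ("el8", (425, 3, 1), None),
    ("el8_8", (477, 10, 1), (477, 27, 1)),
]

# Accepts "uname -r" output or a package name, with or without an arch suffix.
KERNEL_RE = re.compile(
    r"^(?:kernel-)?4\.18\.0-(\d+(?:\.\d+)*)\.(el8(?:_\d+)?)(?:\.\w+)?$"
)


def parse_release(kernel):
    """Split a RHEL 8 kernel string into (release tuple, dist tag), or None."""
    m = KERNEL_RE.match(kernel)
    if m is None:
        return None
    release = tuple(int(part) for part in m.group(1).split("."))
    return release, m.group(2)


def is_affected(kernel):
    """True if the kernel lies inside one of the listed ranges."""
    parsed = parse_release(kernel)
    if parsed is None:
        return False
    release, dist = parsed
    for tag, first_affected, upper in AFFECTED:
        if dist != tag:
            continue
        if release < first_affected:
            return False
        return upper is None or release < upper
    return False


if __name__ == "__main__":
    # Kernel version taken from the crash messages in this article.
    print(is_affected("4.18.0-305.72.1.el8_4.x86_64"))
```

Releases are compared as integer tuples because a plain string comparison would order 305.103.1 before 305.62.1.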
