RHEL8.7 system fails to boot and hangs with soft lockup and rcu_sched CPU stall as soon as the mfe_aac_100713183 module is loaded
Issue
- System hangs with soft lockup and rcu_sched CPU stall events after updating the kernel from version 4.18.0-425.3.1.el8 to 4.18.0-425.10.1.el8_7 or higher.
- RHEL 8.7 system with kernel version 4.18.0-425.10.1.el8_7 or higher fails to boot and hangs with the following call traces as soon as the [mfe_aac_100713183] module is loaded.
[ 8.280855] mfe_aac_100713183: loading out-of-tree module taints kernel.
[ 8.280965] mfe_aac_100713183: module verification failed: signature and/or required key missing - tainting kernel
[ 8.286528] AAC rule matching/reporting engine initialized successfully
[ 64.018837] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nessus-agent-mo:1357]
[ 64.018846] Modules linked in: mfe_aac_100713183(OE+) nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink sunrpc vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul crc32_pclmul vmw_balloon ghash_clmulni_intel rapl joydev pcspkr vmw_vmci ext4 mbcache jbd2 sr_mod sd_mod cdrom t10_pi sg ata_generic vmwgfx drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci crc32c_intel serio_raw libahci ata_piix vmxnet3 libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod fuse
[ 64.018893] CPU: 0 PID: 1357 Comm: nessus-agent-mo Tainted: GOEL ----------- 4.18.0-425.10.1.el8_7.x86_64 #1
[ 64.018896] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
[ 64.018897] RIP: 0010:native_queued_spin_lock_slowpath+0x24/0x1c0
[ 64.018903] Code: ff ff 0f 1f 40 00 0f 1f 44 00 00 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 0d f0 0f b1 17 85 c0 75 f2 e9 6e b7 aa 00 f3 90 <eb> e9 81 fe 00 01 00 00 74 44 81 e6 00 ff ff ff 75 71 f0 0f ba 2f
[ 64.018913] RSP: 0018:ffffb45f82253cb8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[ 64.018916] RAX: 0000000000000001 RBX: ffffb45f82253dc0 RCX: ffffb45f82253dc0
[ 64.018917] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffffc0a21e24
[ 64.018919] RBP: ffffb45f82253de0 R08: 0000000000000000 R09: 0000000000000000
[ 64.018920] R10: ffff981b4493a080 R11: 0000000000000001 R12: ffff981b56cfbc00
[ 64.018921] R13: 000000000000054d R14: 0000000000000001 R15: 0000000000000012
[ 64.018923] FS: 00007fe4af5470c0(0000) GS:ffff981c77c00000(0000) knlGS:0000000000000000
[ 64.018925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 64.018927] CR2: 0000555c40263000 CR3: 000000012a708002 CR4: 00000000007706f0
[ 64.018934] PKRU: 55555554
[ 64.018935] Call Trace:
[ 64.018937] _raw_spin_lock+0x1e/0x30
[ 64.018943] mfe_aac_is_pid_registered+0x1a/0x80 [mfe_aac_100713183]
[ 64.018954] mfe_aac_handle_auth_events+0x180/0x330 [mfe_aac_100713183]
[ 64.018962] mfe_aac_process_pre_events+0xfa/0x180 [mfe_aac_100713183]
[ 64.018971] mfe_aac_sys_openat_64_bit+0x263/0x2e0 [mfe_aac_100713183]
[ 64.018979] ? __slab_free+0x1f7/0x350
[ 64.018983] ? __fput+0x12c/0x250
[ 64.018985] ? kmem_cache_free+0x2d6/0x300
[ 64.018988] ? __audit_syscall_entry+0xf2/0x140
[ 64.018992] ? syscall_trace_enter+0x1ff/0x2d0
[ 64.018997] ? _cond_resched+0x15/0x30
[ 64.019000] ? do_syscall_64+0x5b/0x1b0
[ 64.019003] do_syscall_64+0x5b/0x1b0
[ 64.019006] entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 64.019009] RIP: 0033:0x7fe4aed1821f
[ 64.019011] Code: 52 89 f0 25 00 00 41 00 3d 00 00 41 00 74 44 8b 05 36 d2 20 00 85 c0 75 65 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 9d 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[ 64.019013] RSP: 002b:00007ffe2b3f7990 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[ 64.019015] RAX: ffffffffffffffda RBX: 00007ffe2b3f7b10 RCX: 00007fe4aed1821f
[ 64.019016] RDX: 0000000000000000 RSI: 00007ffe2b3f7b70 RDI: 00000000ffffff9c
[ 64.019017] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 64.019018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 64.019020] R13: 0000000000000000 R14: 00007ffe2b3f7b70 R15: 0000000000000000
[ 68.283677] rcu: INFO: rcu_sched self-detected stall on CPU
[ 68.283685] rcu: 1-...!: (59999 ticks this GP) idle=fd6/1/0x4000000000000002 softirq=21675/21675 fqs=1
[ 68.283692] (t=60000 jiffies g=10977 q=291)
[ 68.283694] rcu: rcu_sched kthread starved for 59996 jiffies! g10977 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 68.283699] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 68.283701] rcu: RCU grace-period kthread stack dump:
[ 68.283704] task:rcu_sched state:R running task stack: 0 pid: 13 ppid: 2 flags:0x80004080
[ 68.283708] Call Trace:
[ 68.283710] __schedule+0x2d1/0x860
[ 68.283717] schedule+0x35/0xa0
[ 68.283719] schedule_timeout+0x197/0x300
[ 68.283723] ? __next_timer_interrupt+0xf0/0xf0
[ 68.283727] ? __prepare_to_swait+0x4f/0x80
[ 68.283731] rcu_gp_kthread+0x4ee/0x790
[ 68.283736] ? rcu_gp_cleanup+0x370/0x370
[ 68.283738] kthread+0x10b/0x130
[ 68.283743] ? set_kthread_struct+0x50/0x50
[ 68.283746] ret_from_fork+0x1f/0x40
[ 68.283750] rcu: Stack dump where RCU GP kthread last ran:
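When triaging a saved console log for this signature, it can help to extract the soft lockup events and the stack frames that belong to a module. The following is a minimal sketch, not a supported tool: the function name `summarize_console_log` and the regular expressions are assumptions derived only from the message formats visible in the log above, and the sample text is a shortened copy of that log.

```python
import re

# Patterns are assumptions based on the message formats shown in the log above.
SOFT_LOCKUP = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]")
# A frame such as "func+0x1a/0x80 [module]"; core kernel frames have no "[module]".
MODULE_FRAME = re.compile(r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")
RCU_STALL = "rcu_sched self-detected stall"


def summarize_console_log(text):
    """Collect soft lockups, module-owned stack frames, and RCU stall counts."""
    lockups, frames, rcu_stalls = [], [], 0
    for line in text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            lockups.append({
                "cpu": int(match.group(1)),
                "seconds": int(match.group(2)),
                "comm": match.group(3),
                "pid": int(match.group(4)),
            })
            continue
        match = MODULE_FRAME.search(line)
        if match:
            # Store as (module, function) in the order the frames appear.
            frames.append((match.group(2), match.group(1)))
        if RCU_STALL in line:
            rcu_stalls += 1
    return {"lockups": lockups, "module_frames": frames, "rcu_stalls": rcu_stalls}


# Shortened copy of the log above, used only to exercise the parser.
SAMPLE = """\
[ 64.018837] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nessus-agent-mo:1357]
[ 64.018897] RIP: 0010:native_queued_spin_lock_slowpath+0x24/0x1c0
[ 64.018937] _raw_spin_lock+0x1e/0x30
[ 64.018943] mfe_aac_is_pid_registered+0x1a/0x80 [mfe_aac_100713183]
[ 64.018954] mfe_aac_handle_auth_events+0x180/0x330 [mfe_aac_100713183]
[ 68.283677] rcu: INFO: rcu_sched self-detected stall on CPU
"""

if __name__ == "__main__":
    summary = summarize_console_log(SAMPLE)
    for event in summary["lockups"]:
        print("soft lockup:", event)
    for module, function in summary["module_frames"]:
        print("module frame:", module, function)
    print("rcu_sched stalls:", summary["rcu_stalls"])
```

In the log above, the first module-owned frame directly below `_raw_spin_lock` is `mfe_aac_is_pid_registered`, which indicates the task was spinning on a lock taken inside the out-of-tree module.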
Environment
- Red Hat Enterprise Linux release 8.7 (Ootpa)
- kernel-4.18.0-425.10.1.el8_7 or higher
- Trellix out-of-tree (O) kernel module: [mfe_aac/mfe_aac_100713183]
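The environment conditions above can be checked with a short script. This is a sketch under stated assumptions: the helper names `parse_el8_release`, `kernel_in_listed_range`, and `loaded_aac_modules` are hypothetical, the comparison uses only the numeric fields of the release string (a simplification, not full RPM version comparison), and the threshold is simply the kernel named in the Environment list.

```python
import re

# Threshold taken from the Environment list: kernel-4.18.0-425.10.1.el8_7.
THRESHOLD = ((4, 18, 0), (425, 10, 1))

# Assumed shape of a RHEL 8 kernel release, e.g. "4.18.0-425.10.1.el8_7.x86_64".
EL8_RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+(?:\.\d+)*)\.el8")


def parse_el8_release(release):
    """Return ((major, minor, patch), (release fields...)) or None if not el8."""
    match = EL8_RELEASE.match(release)
    if match is None:
        return None
    version = tuple(int(match.group(i)) for i in (1, 2, 3))
    build = tuple(int(part) for part in match.group(4).split("."))
    return version, build


def kernel_in_listed_range(release):
    """True if the release is an el8 kernel at or above the listed threshold."""
    parsed = parse_el8_release(release)
    return parsed is not None and parsed >= THRESHOLD


def loaded_aac_modules(proc_modules_text):
    """Return names from /proc/modules-style text that start with 'mfe_aac'."""
    names = [line.split()[0] for line in proc_modules_text.splitlines() if line.strip()]
    return [name for name in names if name.startswith("mfe_aac")]


if __name__ == "__main__":
    import platform

    release = platform.release()
    print(release, "in listed range:", kernel_in_listed_range(release))
    try:
        with open("/proc/modules") as handle:
            print("mfe_aac modules loaded:", loaded_aac_modules(handle.read()))
    except OSError:
        print("/proc/modules is not readable on this system")
```

A match on both checks only means the system fits the environment described here; it does not by itself confirm the hang.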