RHEL 8.7 system fails to boot and hangs with soft lockup and rcu_sched CPU stall as soon as the mfe_aac_100713183 module is loaded

Solution Verified - Updated -

Issue

  • The system hangs with soft lockup and rcu_sched CPU stall events after the kernel is updated from 4.18.0-425.3.1.el8 to 4.18.0-425.10.1.el8_7 or later.
  • A RHEL 8.7 system running kernel 4.18.0-425.10.1.el8_7 or later fails to boot and hangs with the following call traces as soon as the [mfe_aac_100713183] module is loaded:
[    8.280855] mfe_aac_100713183: loading out-of-tree module taints kernel.
[    8.280965] mfe_aac_100713183: module verification failed: signature and/or required key missing - tainting kernel
[    8.286528] AAC rule matching/reporting engine initialized successfully
[   64.018837] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nessus-agent-mo:1357]
[   64.018846] Modules linked in: mfe_aac_100713183(OE+) nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink sunrpc vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul crc32_pclmul vmw_balloon ghash_clmulni_intel rapl joydev pcspkr vmw_vmci ext4 mbcache jbd2 sr_mod sd_mod cdrom t10_pi sg ata_generic vmwgfx drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci crc32c_intel serio_raw libahci ata_piix vmxnet3 libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod fuse
[   64.018893] CPU: 0 PID: 1357 Comm: nessus-agent-mo Tainted: GOEL  ----------- 4.18.0-425.10.1.el8_7.x86_64 #1
[   64.018896] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
[   64.018897] RIP: 0010:native_queued_spin_lock_slowpath+0x24/0x1c0
[   64.018903] Code: ff ff 0f 1f 40 00 0f 1f 44 00 00 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 0d f0 0f b1 17 85 c0 75 f2 e9 6e b7 aa 00 f3 90 <eb> e9 81 fe 00 01 00 00 74 44 81 e6 00 ff ff ff 75 71 f0 0f ba 2f
[   64.018913] RSP: 0018:ffffb45f82253cb8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[   64.018916] RAX: 0000000000000001 RBX: ffffb45f82253dc0 RCX: ffffb45f82253dc0
[   64.018917] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffffc0a21e24
[   64.018919] RBP: ffffb45f82253de0 R08: 0000000000000000 R09: 0000000000000000
[   64.018920] R10: ffff981b4493a080 R11: 0000000000000001 R12: ffff981b56cfbc00
[   64.018921] R13: 000000000000054d R14: 0000000000000001 R15: 0000000000000012
[   64.018923] FS:  00007fe4af5470c0(0000) GS:ffff981c77c00000(0000) knlGS:0000000000000000
[   64.018925] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   64.018927] CR2: 0000555c40263000 CR3: 000000012a708002 CR4: 00000000007706f0
[   64.018934] PKRU: 55555554
[   64.018935] Call Trace:
[   64.018937]  _raw_spin_lock+0x1e/0x30
[   64.018943]  mfe_aac_is_pid_registered+0x1a/0x80 [mfe_aac_100713183]
[   64.018954]  mfe_aac_handle_auth_events+0x180/0x330 [mfe_aac_100713183]
[   64.018962]  mfe_aac_process_pre_events+0xfa/0x180 [mfe_aac_100713183]
[   64.018971]  mfe_aac_sys_openat_64_bit+0x263/0x2e0 [mfe_aac_100713183]
[   64.018979]  ? __slab_free+0x1f7/0x350
[   64.018983]  ? __fput+0x12c/0x250
[   64.018985]  ? kmem_cache_free+0x2d6/0x300
[   64.018988]  ? __audit_syscall_entry+0xf2/0x140
[   64.018992]  ? syscall_trace_enter+0x1ff/0x2d0
[   64.018997]  ? _cond_resched+0x15/0x30
[   64.019000]  ? do_syscall_64+0x5b/0x1b0
[   64.019003]  do_syscall_64+0x5b/0x1b0
[   64.019006]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[   64.019009] RIP: 0033:0x7fe4aed1821f
[   64.019011] Code: 52 89 f0 25 00 00 41 00 3d 00 00 41 00 74 44 8b 05 36 d2 20 00 85 c0 75 65 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 9d 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[   64.019013] RSP: 002b:00007ffe2b3f7990 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[   64.019015] RAX: ffffffffffffffda RBX: 00007ffe2b3f7b10 RCX: 00007fe4aed1821f
[   64.019016] RDX: 0000000000000000 RSI: 00007ffe2b3f7b70 RDI: 00000000ffffff9c
[   64.019017] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   64.019018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   64.019020] R13: 0000000000000000 R14: 00007ffe2b3f7b70 R15: 0000000000000000
[   68.283677] rcu: INFO: rcu_sched self-detected stall on CPU
[   68.283685] rcu:     1-...!: (59999 ticks this GP) idle=fd6/1/0x4000000000000002 softirq=21675/21675 fqs=1 
[   68.283692]  (t=60000 jiffies g=10977 q=291)
[   68.283694] rcu: rcu_sched kthread starved for 59996 jiffies! g10977 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[   68.283699] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   68.283701] rcu: RCU grace-period kthread stack dump:
[   68.283704] task:rcu_sched       state:R  running task     stack:    0 pid:   13 ppid:     2 flags:0x80004080
[   68.283708] Call Trace:
[   68.283710]  __schedule+0x2d1/0x860
[   68.283717]  schedule+0x35/0xa0
[   68.283719]  schedule_timeout+0x197/0x300
[   68.283723]  ? __next_timer_interrupt+0xf0/0xf0
[   68.283727]  ? __prepare_to_swait+0x4f/0x80
[   68.283731]  rcu_gp_kthread+0x4ee/0x790
[   68.283736]  ? rcu_gp_cleanup+0x370/0x370
[   68.283738]  kthread+0x10b/0x130
[   68.283743]  ? set_kthread_struct+0x50/0x50
[   68.283746]  ret_from_fork+0x1f/0x40
[   68.283750] rcu: Stack dump where RCU GP kthread last ran:
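
To confirm that a captured console log shows the same pattern, a short script can pull out the soft lockup events, the rcu_sched stall message, and the modules that own frames in the call traces. The following is a minimal sketch and not a Red Hat or Trellix tool: the function name `summarize_log` and the regular expressions are assumptions, written only against the log format shown above.

```python
import re

# "watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nessus-agent-mo:1357]"
SOFT_LOCKUP = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^\]]+)\]")

# "rcu: INFO: rcu_sched self-detected stall on CPU"
RCU_STALL = re.compile(r"rcu_sched self-detected stall on CPU")

# Call-trace frames that belong to a module end with the module name in
# brackets, e.g. "mfe_aac_is_pid_registered+0x1a/0x80 [mfe_aac_100713183]".
MODULE_FRAME = re.compile(r"\s(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def summarize_log(log_text):
    """Return soft lockups, RCU stall presence, and modules seen in traces."""
    lockups = []       # (cpu, seconds stuck, "comm:pid")
    modules = set()    # module names that own at least one trace frame
    rcu_stall = False

    for line in log_text.splitlines():
        hit = SOFT_LOCKUP.search(line)
        if hit:
            lockups.append((int(hit.group(1)), int(hit.group(2)), hit.group(3)))
        if RCU_STALL.search(line):
            rcu_stall = True
        frame = MODULE_FRAME.search(line)
        if frame:
            modules.add(frame.group(2))

    return {
        "soft_lockups": lockups,
        "rcu_stall": rcu_stall,
        "modules_in_trace": sorted(modules),
    }
```

Run against the log above, the summary would list one soft lockup on CPU 0 and `mfe_aac_100713183` as the only module owning frames in the trace, which points at the spinlock taken in `mfe_aac_is_pid_registered`.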

Environment

  • Red Hat Enterprise Linux release 8.7 (Ootpa)
  • kernel-4.18.0-425.10.1.el8_7 or later
  • Trellix out-of-tree (O) kernel module: [mfe_aac/mfe_aac_100713183]
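
Whether a host matches this environment can be checked from its kernel release string and its list of loaded modules. The sketch below is illustrative only: the helper names, the `mfe_aac` prefix match, and the simplified numeric comparison are assumptions, and the comparison is not a full RPM version comparison. It takes the output of `uname -r` and the contents of `/proc/modules` as plain strings.

```python
import re

# First affected build named in this article: 4.18.0-425.10.1.el8_7
FIRST_AFFECTED = (4, 18, 0, 425, 10, 1)


def kernel_tuple(release):
    """Turn '4.18.0-425.10.1.el8_7.x86_64' into (4, 18, 0, 425, 10, 1)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-([\d.]+)\.el", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    base = [int(m.group(i)) for i in (1, 2, 3)]
    build = [int(part) for part in m.group(4).split(".")]
    return tuple(base + build)


def loaded_modules(proc_modules_text, prefix):
    """Return module names from /proc/modules content that start with prefix."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return names


def matches_environment(release, proc_modules_text, prefix="mfe_aac"):
    """True if the kernel is at or past the first affected build and a
    module with the given prefix is loaded."""
    affected_kernel = kernel_tuple(release) >= FIRST_AFFECTED
    return affected_kernel and bool(loaded_modules(proc_modules_text, prefix))
```

On a live system, the two inputs could come from `platform.release()` and from reading `/proc/modules`. On a host that already hangs at boot, the check would have to be run from the older kernel or a rescue environment.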
