Server crashed due to deadlock in 3rd party McAfee module
Environment
- Red Hat Enterprise Linux
- McAfee Endpoint Security
Issue
- Server rebooted unexpectedly/crashed in the presence of third party McAfee kernel module.
Resolution
- Engage the respective module vendor with regards to McAfee's Endpoint Security.
Workaround
-
Remove or blacklist the McAfee kernel module:
Root Cause
- McAfee kernel module caused a deadlock, leading to a hang and subsequent crash.
Diagnostic Steps
Pre-requisites
-
Deploy kdump in Order to Collect a vmcore:
- Vmcore analyis is required to determine if you are being impacted by this issue. This first requires that a vmcore is dumped successfully.
- If the
kexec-tools
package is absent or thekdump
service is inactive, please reference the following article to install, enable, start, and configure kdump:
How to troubleshoot kernel crashes, hangs, or reboots with kdump on Red Hat Enterprise Linux
-
Prepare crash Environment for vmcore Analysis:
- Please reference the following article to set up a vmcore analysis environment:
How to set up a vmcore analysis environment?
- Please reference the following article to set up a vmcore analysis environment:
Vmcore Analysis
-
The vmcore was created due to the
cssdmonitor
task intentionally panicking the system with a sysrq. This is a cluster eviction.crash> bt PID: 3151310 TASK: ffff89a901f6c000 CPU: 35 COMMAND: "cssdmonitor" #0 [ffff9f4ab5af3cd8] machine_kexec at ffffffffacc6bec3 #1 [ffff9f4ab5af3d30] __crash_kexec at ffffffffacdb564a #2 [ffff9f4ab5af3df0] panic at ffffffffaccf70bf #3 [ffff9f4ab5af3e70] sysrq_handle_crash at ffffffffad1fad91 #4 [ffff9f4ab5af3e78] __handle_sysrq.cold.13 at ffffffffad1fb6b4 #5 [ffff9f4ab5af3ea8] write_sysrq_trigger at ffffffffad1fb55b #6 [ffff9f4ab5af3eb8] proc_reg_write at ffffffffacfea7a9 #7 [ffff9f4ab5af3ed0] vfs_write at ffffffffacf64be5 #8 [ffff9f4ab5af3f00] ksys_write at ffffffffacf64e6f #9 [ffff9f4ab5af3f38] do_syscall_64 at ffffffffacc052fb #10 [ffff9f4ab5af3f50] entry_SYSCALL_64_after_hwframe at ffffffffad8000a9
-
At the time of the panic, the load averages were elevated due to the number of blocked uninterruptible state tasks.
LOAD AVERAGE: 1207.03, 1156.46, 890.58 crash> ps -S RU: 56 IN: 3884 ID: 472 UN: 1198 ZO: 1
-
The longest blocked task was PID 5718.
crash> ps -m | grep UN | tail -n 3 [ 0 00:20:50.614] [UN] PID: 2131812 TASK: ffff89c68fab0000 CPU: 17 COMMAND: "java" [ 0 00:23:37.869] [UN] PID: 1571647 TASK: ffff89a2d3790000 CPU: 10 COMMAND: "mfetpd" [ 0 00:23:37.870] [UN] PID: 5718 TASK: ffff896649458000 CPU: 30 COMMAND: "ClMgrS"
-
It was waiting for a user space task to be created on behalf of the NFS subsystem. This notification is through a completion structure.
crash> bt 5718 PID: 5718 TASK: ffff896649458000 CPU: 30 COMMAND: "ClMgrS" #0 [ffff9f4a9dba32a8] __schedule at ffffffffad5fa5d1 #1 [ffff9f4a9dba3300] schedule at ffffffffad5fabc5 #2 [ffff9f4a9dba3318] schedule_timeout at ffffffffad5ff6b8 #3 [ffff9f4a9dba33b0] wait_for_completion at ffffffffad5fb5f6 #4 [ffff9f4a9dba33f0] call_usermodehelper_exec at ffffffffacd12702 #5 [ffff9f4a9dba3430] call_sbin_request_key at ffffffffad028024 #6 [ffff9f4a9dba3538] request_key_and_link at ffffffffad028414 #7 [ffff9f4a9dba3600] request_key at ffffffffad0287ac #8 [ffff9f4a9dba3630] nfs_idmap_get_key at ffffffffc07c9acf [nfsv4] #9 [ffff9f4a9dba3688] nfs_idmap_lookup_id at ffffffffc07c9bb0 [nfsv4] #10 [ffff9f4a9dba36b8] nfs_map_name_to_uid at ffffffffc07ca1be [nfsv4] #11 [ffff9f4a9dba36f8] decode_getfattr_attrs at ffffffffc07c0bf5 [nfsv4] #12 [ffff9f4a9dba3778] decode_getfattr_generic.constprop.127 at ffffffffc07c10a2 [nfsv4] #13 [ffff9f4a9dba37d8] nfs4_xdr_dec_getattr at ffffffffc07c19ba [nfsv4] #14 [ffff9f4a9dba3828] call_decode at ffffffffc0716dfb [sunrpc] #15 [ffff9f4a9dba38a0] __rpc_execute at ffffffffc0726118 [sunrpc] #16 [ffff9f4a9dba38f8] rpc_execute at ffffffffc072e907 [sunrpc] #17 [ffff9f4a9dba3918] rpc_run_task at ffffffffc071897c [sunrpc] #18 [ffff9f4a9dba3958] nfs4_call_sync_custom at ffffffffc07a31bb [nfsv4] #19 [ffff9f4a9dba3968] nfs4_do_call_sync at ffffffffc07a3249 [nfsv4] #20 [ffff9f4a9dba39d8] _nfs4_proc_getattr at ffffffffc07a5fd8 [nfsv4] #21 [ffff9f4a9dba3a98] nfs4_proc_getattr at ffffffffc07aecf1 [nfsv4] #22 [ffff9f4a9dba3b00] __nfs_revalidate_inode at ffffffffc0aebb52 [nfs] #23 [ffff9f4a9dba3b28] nfs_access_get_cached at ffffffffc0ae3fb1 [nfs] #24 [ffff9f4a9dba3b78] nfs_do_access at ffffffffc0ae40d2 [nfs] #25 [ffff9f4a9dba3c10] nfs_permission at ffffffffc0ae439e [nfs] #26 [ffff9f4a9dba3c30] inode_permission at ffffffffacf71a5a #27 [ffff9f4a9dba3c58] link_path_walk at ffffffffacf72101 #28 [ffff9f4a9dba3cb8] path_lookupat at ffffffffacf723bb #29 [ffff9f4a9dba3d18] filename_lookup at ffffffffacf76a50 #30 [ffff9f4a9dba3e40] vfs_statx at ffffffffacf69784 #31 [ffff9f4a9dba3e98] __do_sys_newstat at ffffffffacf69dd9 #32 [ffff9f4a9dba3f38] do_syscall_64 at ffffffffacc052fb #33 [ffff9f4a9dba3f50] entry_SYSCALL_64_after_hwframe at ffffffffad8000a9 RIP: 00007fe702268f89 RSP: 00007fe6f30f0688 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe702268f89 RDX: 00007fe6f30f0950 RSI: 00007fe6f30f0950 RDI: 00007fe6d40064d0 RBP: 00007fe6f30f06a0 R8: 000000000000000c R9: 00007fe6fdd2329f R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe6fdc7f212 R13: 00007fe6f31f15ef R14: 00007fe6f31f16b0 R15: 00007fe6f30f0f40 ORIG_RAX: 0000000000000004 CS: 0033 SS: 002b
-
We can find the completion structure by referencing the assembly. Per x86 convention, we know that the first parameter of a routine is passed via register RDI.
Before calling
wait_for_completion()
, we see that the contents of register RSP is used to fill RDI, containing the completion structure.crash> whatis wait_for_completion void wait_for_completion(struct completion *); crash> dis -rl ffffffffacd12702 | tail /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 442 0xffffffffacd126ef <call_usermodehelper_exec+287>: xor %eax,%eax 0xffffffffacd126f1 <call_usermodehelper_exec+289>: xchg %rax,0x40(%rbx) 0xffffffffacd126f5 <call_usermodehelper_exec+293>: test %rax,%rax 0xffffffffacd126f8 <call_usermodehelper_exec+296>: jne 0xffffffffacd1267d <call_usermodehelper_exec+173> /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 447 0xffffffffacd126fa <call_usermodehelper_exec+298>: mov %rsp,%rdi 0xffffffffacd126fd <call_usermodehelper_exec+301>: call 0xffffffffad5fb560 <wait_for_completion>
-
Earlier in this routine, we can reference the assembly to build the stack and annotate it.
crash> dis -rl ffffffffacd12702 | head -n 20 /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 402 0xffffffffacd125d0 <call_usermodehelper_exec>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] 0xffffffffacd125d5 <call_usermodehelper_exec+0x5>: push %rbp 0xffffffffacd125d6 <call_usermodehelper_exec+0x6>: push %rbx 0xffffffffacd125d7 <call_usermodehelper_exec+0x7>: mov %rdi,%rbx 0xffffffffacd125da <call_usermodehelper_exec+0xa>: sub $0x28,%rsp #4 [ffff9f4a9dba33f0] call_usermodehelper_exec+0x132 at ffffffffacd12702 /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 449 ffff9f4a9dba33f8: ffffffff00000000 ffff89a500000000 ---. RSP^ | sub $0x28,%rsp ffff9f4a9dba3408: ffff9f4a9dba33c0 ffff9f4a9dba33c0 | ffff9f4a9dba3418: d1b5ac3f88cc4b00 ---' [ffff89a579d57be0:kmalloc-16] RBX^ ffff9f4a9dba3428: [ffff89add5035d00:key_jar] call_sbin_request_key+0x264 RBP^ RIP^ #5 [ffff9f4a9dba3430] call_sbin_request_key+0x264 at ffffffffad028024
-
Register RSP, contained
0xffff9f4a9dba33f8
. The completion is not marked as done yet.crash> struct completion ffff9f4a9dba33f8 struct completion { done = 0,
-
Per the source, we know that the subprocess_info structure used to create the userspace task is passed into
call_usermodehelper_exec()
as the first parameter.crash> whatis call_usermodehelper_exec int call_usermodehelper_exec(struct subprocess_info *, int);
-
This was passed via RDI into
call_usermodehelper_exec()
. The contents of RDI were then copied to register RBX. Register RBX was not modified for the remainder of the routine.crash> dis -rl ffffffffacd12702 | grep bx 0xffffffffacd125d6 <call_usermodehelper_exec+0x6>: push %rbx 0xffffffffacd125d7 <call_usermodehelper_exec+0x7>: mov %rdi,%rbx 0xffffffffacd12664 <call_usermodehelper_exec+0x94>: mov 0x70(%rbx),%rax 0xffffffffacd1266d <call_usermodehelper_exec+0x9d>: mov %rbx,%rdi 0xffffffffacd12675 <call_usermodehelper_exec+0xa5>: mov %rbx,%rdi 0xffffffffacd126a0 <call_usermodehelper_exec+0xd0>: pop %rbx 0xffffffffacd126f1 <call_usermodehelper_exec+0x121>: xchg %rax,0x40(%rbx) 0xffffffffacd12702 <call_usermodehelper_exec+0x132>: mov 0x64(%rbx),%ebp
-
Register RBX was then pushed to the stack on the third push of the following routine.
crash> dis -rl ffffffffad5fb5f6 | head /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/sched/completion.c: 137 0xffffffffad5fb560 <wait_for_completion>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] 0xffffffffad5fb565 <wait_for_completion+0x5>: push %r12 0xffffffffad5fb567 <wait_for_completion+0x7>: push %rbp 0xffffffffad5fb568 <wait_for_completion+0x8>: mov %rdi,%rbp 0xffffffffad5fb56b <wait_for_completion+0xb>: push %rbx /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/sched/completion.c: 105 0xffffffffad5fb56c <wait_for_completion+0xc>: lea 0x8(%rbp),%rbx /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/sched/completion.c: 137 0xffffffffad5fb570 <wait_for_completion+0x10>: sub $0x20,%rsp
-
Annotating the stack, we see that register RBX held address
0xffff89a7509dbd80
, thesubprocess_info
structure.#3 [ffff9f4a9dba33b0] wait_for_completion+0x96 at ffffffffad5fb5f6 /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/sched/completion.c: 86 ffff9f4a9dba33b8: [ffff896649458000:task_struct] ffff9f4a9dba3408 ---. ffff9f4a9dba33c8: ffff9f4a9dba3408 d1b5ac3f88cc4b00 ---'-- sub $0x20,%rsp ffff9f4a9dba33d8: [ffff89a7509dbd80:kmalloc-128] 0000000000000000 RBX^ RBP^ ffff9f4a9dba33e8: [ffff89add5034000:key_jar] call_usermodehelper_exec+0x132 R12^ RIP^ #4 [ffff9f4a9dba33f0] call_usermodehelper_exec+0x132 at ffffffffacd12702 crash> struct subprocess_info ffff89a7509dbd80 struct subprocess_info { work = { data = { counter = 0xc40 }, entry = { next = 0xffff89a7509dbd88, prev = 0xffff89a7509dbd88 }, func = 0xffffffffacd12770 <call_usermodehelper_exec_work>, { bdi_wb_backptr = 0x0, rh_kabi_hidden_111 = { rh_reserved1 = 0x0 }, {<No data fields>} }, rh_reserved2 = 0x0, rh_reserved3 = 0x0, rh_reserved4 = 0x0 }, complete = 0xffff9f4a9dba33f8, path = 0xffffffffadc576b0 <request_key> "/sbin/request-key", argv = 0xffff9f4a9dba3480, envp = 0xffff9f4a9dba3440, wait = 0x2, retval = 0x0, init = 0xffffffffad027d30 <umh_keys_init>, cleanup = 0xffffffffad027cd0 <umh_keys_cleanup>, data = 0xffff89add5035d00 }
-
The
subprocess_info
structure is found in the long blocked task, and also akworker
PID 1036030.crash> search -t ffff89a7509dbd80 PID: 5718 TASK: ffff896649458000 CPU: 30 COMMAND: "ClMgrS" ffff9f4a9dba33d8: ffff89a7509dbd80 PID: 1036030 TASK: ffff89accebf8000 CPU: 26 COMMAND: "kworker/u98:1" ffff9f4ab7527de0: ffff89a7509dbd80 ffff9f4ab7527e78: ffff89a7509dbd80 ffff9f4ab7527e90: ffff89a7509dbd80
-
The
kworker
PID 1036030 has been in interruptible state for almost 24 minutes.crash> ps -m 1036030 [ 0 00:23:37.870] [IN] PID: 1036030 TASK: ffff89accebf8000 CPU: 26 COMMAND: "kworker/u98:1"
-
This is it's call stack. It's waiting to be notified that a child PID has been created.
crash> bt 1036030 PID: 1036030 TASK: ffff89accebf8000 CPU: 26 COMMAND: "kworker/u98:1" #0 [ffff9f4ab7527d68] __schedule at ffffffffad5fa5d1 #1 [ffff9f4ab7527dc0] schedule at ffffffffad5fabc5 #2 [ffff9f4ab7527dd8] do_wait at ffffffffaccfb3e5 #3 [ffff9f4ab7527e10] kernel_wait at ffffffffaccfcbe7 #4 [ffff9f4ab7527e88] call_usermodehelper_exec_work at ffffffffacd127c9 #5 [ffff9f4ab7527e98] process_one_work at ffffffffacd15ba7 #6 [ffff9f4ab7527ed8] worker_thread at ffffffffacd16260 #7 [ffff9f4ab7527f10] kthread at ffffffffacd1d3d4 #8 [ffff9f4ab7527f50] ret_from_fork at ffffffffad80024f
-
We can verify that it's trying to create the user space PID from the same
subprocess_info
structure by referencing the assembly. The functionprocess_one_work()
passes awork_struct
structure to routinecall_usermodehelper_exec_work()
, which is then used as asubprocess_info
structure.157 static void call_usermodehelper_exec_work(struct work_struct *work) 158 { 159 struct subprocess_info *sub_info = 160 container_of(work, struct subprocess_info, work);
-
The assembly shows that the
work_struct
is passed from RBX to RDI before callingcall_usermodehelper_exec_work()
.crash> dis -rl ffffffffacd15ba7 | tail 0xffffffffacd15b8f <process_one_work+0x18f>: sti 0xffffffffacd15b90 <process_one_work+0x190>: nopw 0x0(%rax,%rax,1) /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/./arch/x86/include/asm/jump_label.h: 38 0xffffffffacd15b96 <process_one_work+0x196>: nopl 0x0(%rax,%rax,1) /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/workqueue.c: 2287 0xffffffffacd15b9b <process_one_work+0x19b>: mov 0x18(%r14),%rax 0xffffffffacd15b9f <process_one_work+0x19f>: mov %rbx,%rdi 0xffffffffacd15ba2 <process_one_work+0x1a2>: call 0xffffffffada01200 <__x86_indirect_thunk_rax>
-
RBX is then pushed to the stack on the first push of
call_usermodehelper_exec_work()
.crash> dis -rl ffffffffacd127c9 | head /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 158 0xffffffffacd12770 <call_usermodehelper_exec_work>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 162 0xffffffffacd12775 <call_usermodehelper_exec_work+0x5>: push %rbx
-
Annotating the stack, we see it's the same address as the
subprocess_info
structure0xffff89a7509dbd80
.#4 [ffff9f4ab7527e88] call_usermodehelper_exec_work+0x59 at ffffffffacd127c9 /usr/src/debug/kernel-4.18.0-477.10.1.el8_8/linux-4.18.0-477.10.1.el8_8.x86_64/kernel/umh.c: 136 ffff9f4ab7527e90: [ffff89a7509dbd80:kmalloc-128] process_one_work+0x1a7 RBX^ RIP^ #5 [ffff9f4ab7527e98] process_one_work+0x1a7 at ffffffffacd15ba7
-
The child PID that the kworker is waiting to be created is
nfsidmap
, which is in uninterruptible blocked state.crash> ps -c 1036030 PID: 1036030 TASK: ffff89accebf8000 CPU: 26 COMMAND: "kworker/u98:1" PID: 1087887 TASK: ffff89a589cf8000 CPU: 36 COMMAND: "nfsidmap" crash> ps -m 1087887 [ 0 00:00:42.304] [UN] PID: 1087887 TASK: ffff89a589cf8000 CPU: 36 COMMAND: "nfsidmap"
-
And is blocked after operating in the third party code provided by the
mfe_fileaccess_1007161049
third party kernel, which isn't provided by Red Hat.crash> bt 1087887 PID: 1087887 TASK: ffff89a589cf8000 CPU: 36 COMMAND: "nfsidmap" #0 [ffff9f4ab3d37d38] __schedule at ffffffffad5fa5d1 #1 [ffff9f4ab3d37d90] schedule at ffffffffad5fabc5 #2 [ffff9f4ab3d37da8] schedule_timeout at ffffffffad5ff5d7 #3 [ffff9f4ab3d37e40] mfe_fileaccess_get_orig_sys_close_addr_32_bit at ffffffffc0349d53 [mfe_fileaccess_1007161049] #4 [ffff9f4ab3d37ec8] mfe_fileaccess_sc_do_scan at ffffffffc034a41c [mfe_fileaccess_1007161049] #5 [ffff9f4ab3d37f00] mfe_fileaccess_sys_openat_64_bit at ffffffffc0348549 [mfe_fileaccess_1007161049] #6 [ffff9f4ab3d37f38] do_syscall_64 at ffffffffacc052fb #7 [ffff9f4ab3d37f50] entry_SYSCALL_64_after_hwframe at ffffffffad8000a9 RIP: 00007f8a3d49e6d2 RSP: 00007ffc39e7e840 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 000055ed229dbc50 RCX: 00007f8a3d49e6d2 RDX: 0000000000000000 RSI: 00007ffc39e82ac0 RDI: 00000000ffffff9c RBP: 0000000000000008 R8: 0000000000000008 R9: 0000000000000001 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8a3c3329d0 R13: 00007f8a3c3329d0 R14: 0000000000000001 R15: 0000000000000000 ORIG_RAX: 0000000000000101 CS: 0033 SS: 002b
-
These third party modules are not supported by Red Hat.
crash> mod -t NAME TAINTS dsa_filter_hook OE oracleasm O mfe_fileaccess_1007161049 OE mfe_aac_1007161049 OE dsa_filter POE
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Comments