The server hangs. Many tasks are blocked in fanotify_handle_event(), waiting for a permission event response from userspace. The server crashes if hung_task_panic is enabled; otherwise, a hard reboot is the only way to recover.
Issue
- The server hangs.
- The server crashes if hung_task_panic is enabled. Otherwise, a hard reboot is the only way to recover once the server has hung completely.
- The following blocked task message is observed in the kernel ring buffer:
[13919.504321] INFO: task fsbspamd:19047 blocked for more than 120 seconds.
[13919.504396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13919.504433] fsbspamd D ffff999c7fc1acc0 0 19047 1246 0x00000080
[13919.504458] Call Trace:
[13919.504595] [<ffffffffb0b85da9>] schedule+0x29/0x70
[13919.504668] [<ffffffffb0699d55>] fanotify_handle_event+0x265/0x3a0
[13919.504704] [<ffffffffb0acd65c>] ? udp_send_skb+0xac/0x2b0
[13919.504749] [<ffffffffb04c7780>] ? wake_up_atomic_t+0x30/0x30
[13919.504754] [<ffffffffb0696128>] fsnotify+0x388/0x460
[13919.504786] [<ffffffffb0707f3e>] security_file_open+0x6e/0x70
[13919.504822] [<ffffffffb064a7b9>] do_dentry_open+0xc9/0x2d0
[13919.504829] [<ffffffffb0707992>] ? security_inode_permission+0x22/0x30
[13919.504834] [<ffffffffb064aa5a>] vfs_open+0x5a/0xb0
[13919.504853] [<ffffffffb06590da>] ? may_open+0x5a/0x120
[13919.504860] [<ffffffffb065d006>] do_last+0x1f6/0x1340
[13919.504866] [<ffffffffb065e21d>] path_openat+0xcd/0x5a0
[13919.504872] [<ffffffffb066046d>] do_filp_open+0x4d/0xb0
[13919.504884] [<ffffffffb066e512>] ? __alloc_fd+0xc2/0x170
[13919.504890] [<ffffffffb064c044>] do_sys_open+0x124/0x220
[13919.504896] [<ffffffffb064c15e>] SyS_open+0x1e/0x20
[13919.504910] [<ffffffffb0b92ed2>] system_call_fastpath+0x25/0x2a
- Many tasks call fanotify_handle_event(), wait for a permission event response from userspace, and remain stuck in TASK_UNINTERRUPTIBLE sleep indefinitely.
- At this point, the task responsible for sending permission event responses back to those tasks is itself waiting for a permission event response from userspace. This is a deadlock caused by a userspace bug.
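The symptom described above can be checked for with a small script. The following is a minimal sketch, not a supported diagnostic tool: it assumes the hung-task report format shown in the excerpt above, and it assumes that a fanotify file descriptor appears in /proc/<pid>/fdinfo/<fd> with a line starting with "fanotify flags:". The function names are hypothetical, and reading another process's fdinfo normally requires root. A blocked task that also holds a fanotify descriptor is a candidate for the listener that is waiting on itself.

```python
import os
import re

# Header line of a hung-task report, as seen in the excerpt above (assumed format).
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than \d+ seconds"
)


def blocked_in_fanotify(log_text):
    """Return [(comm, pid)] for hung-task reports whose trace hits fanotify_handle_event."""
    results = []
    current = None  # (comm, pid) of the report currently being read
    for line in log_text.splitlines():
        match = HUNG_RE.search(line)
        if match:
            current = (match.group("comm"), int(match.group("pid")))
        elif current and "fanotify_handle_event+" in line:
            results.append(current)
            current = None  # count each report only once
    return results


def is_fanotify_listener(fdinfo_text):
    """True if the text of one fdinfo entry describes a fanotify descriptor."""
    return any(
        line.startswith("fanotify flags:") for line in fdinfo_text.splitlines()
    )


def holds_fanotify_fd(pid, proc_root="/proc"):
    """Best-effort scan of every fdinfo entry of `pid`; unreadable entries are skipped."""
    fd_dir = os.path.join(proc_root, str(pid), "fdinfo")
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return False  # task is gone or we lack permission
    for name in names:
        try:
            with open(os.path.join(fd_dir, name)) as handle:
                if is_fanotify_listener(handle.read()):
                    return True
        except OSError:
            continue
    return False


def report(log_text, proc_root="/proc"):
    """Return one summary line per task found blocked in fanotify_handle_event."""
    lines = []
    for comm, pid in blocked_in_fanotify(log_text):
        if holds_fanotify_fd(pid, proc_root):
            note = "holds a fanotify fd (possible self-deadlock)"
        else:
            note = "no fanotify fd seen"
        lines.append(f"{comm}[{pid}]: blocked in fanotify_handle_event, {note}")
    return lines
```

To use it, pass the kernel log text to report(), for example the captured output of dmesg, and print the returned lines. The result is only a hint: it has to be run while the system still responds, and a vmcore analysis remains the reliable way to confirm the deadlock.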
Environment
- Red Hat Enterprise Linux 7.9
- Red Hat Enterprise Linux 8.5
- VMware Endpoint Security Solution (vsep/pool), formerly vShield Endpoint, which works in conjunction with various security software (e.g. Trend Micro, McAfee, Symantec)
- F-Secure (fsavd/fsaccd)
- Microsoft Defender for Linux (wdavdaemon)
- Kaspersky (kesl)
- Twistlock Defender (one of the certified cloud native security for OpenShift operators)
- Cybereason Sensor for Linux
- Trend Micro Deep Security Agent for Linux