High %sys CPU usage and an uptick in the load when 3rd party modules from TrendMicro are in use
Environment
- Red Hat Enterprise Linux 7
- Trend Micro Deep Security Agent (
ds_agent.service
) - Trend Micro Deep Security module [
acdc
]
Issue
- The
top
command shows that the high CPU utilization is caused by the user-space processessa1
andrun-parts
. - The high CPU utilization in the kernel space (
%system
) causing slowness and an uptick in the load.
12:00:02 AM CPU %user %nice %system %iowait %steal %idle
12:10:01 AM all 0.61 0.00 99.39 0.00 0.00 0.00
12:20:02 AM all 0.50 0.00 99.50 0.00 0.00 0.00
12:30:01 AM all 0.67 0.00 99.33 0.00 0.00 0.00
12:00:02 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
12:10:01 AM 33 576 32.23 32.20 32.21 0
12:20:02 AM 33 577 32.14 32.21 32.22 0
12:30:01 AM 34 576 32.20 32.35 32.35 0
Resolution
- Engage the Trend Micro support team for further assistance.
- Check for updates or patches to fix this issue.
Workaround
- Blacklist the Trend Micro Deep Security module [
acdc
]. - Stop and disable the Trend Micro Deep Security Agent service (
ds_agent
).
# systemctl stop ds_agent.service
# systemctl disable ds_agent.service
Root Cause
- The high CPU utilization in the kernel-space (
%system
) is due to spinlock contention caused by the Trend Micro Deep Security module [acdc
].
# perf report -i perf.data --show-cpu-utilization --show-nr-samples --max-stack 0 -d "[kernel.kallsyms]" --percent-limit 1 --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# dso: [kernel.kallsyms]
#
# Total Lost Samples: 0
#
# Samples: 489K of event 'cpu-clock'
# Event count (approx.): 122333500000
#
# Children Self sys usr Samples Command Symbol
# ........ ........ ........ ........ ............ ............... ........................................
#
48.09% 48.09% 48.09% 0.00% 235303 sa1 [k] _raw_spin_unlock_irqrestore
19.23% 19.23% 19.23% 0.00% 94100 sa1 [k] finish_task_switch
6.64% 6.64% 6.64% 0.00% 32512 run-parts [k] _raw_spin_unlock_irqrestore
4.81% 4.81% 4.81% 0.00% 23524 virt-what [k] _raw_spin_unlock_irqrestore
2.49% 2.49% 2.49% 0.00% 12172 run-parts [k] finish_task_switch
2.43% 2.43% 2.43% 0.00% 11888 SapphireIMSAgen [k] _raw_spin_unlock_irqrestore
2.39% 2.39% 2.39% 0.00% 11707 0anacron [k] _raw_spin_unlock_irqrestore
2.07% 2.07% 2.07% 0.00% 10122 ksmtuned [k] _raw_spin_unlock_irqrestore
1.84% 1.84% 1.84% 0.00% 9011 sudo [k] _raw_spin_unlock_irqrestore
1.75% 1.75% 1.75% 0.00% 8545 bash [k] _raw_spin_unlock_irqrestore
1.36% 1.36% 1.36% 0.00% 6633 virt-what [k] finish_task_switch
1.33% 1.33% 1.33% 0.00% 6484 bash [k] finish_task_switch
1.23% 1.23% 1.23% 0.00% 6008 sudo [k] finish_task_switch
1.02% 1.02% 1.02% 0.00% 4971 ksmtuned [k] finish_task_switch
# perf report --sort=comm --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 489K of event 'cpu-clock'
# Event count (approx.): 122333500000
#
# Children Self Command
# ........ ........ ...............
#
68.59% 68.59% sa1
|
---__GI___libc_close
system_call_fastpath
gsch_close_hook_fn
ac_hook_sys_close
__pre_event
|
--68.53%--notify_usermode
|
|--31.67%--prepare_to_wait
| |
| --31.60%--_raw_spin_unlock_irqrestore
|
|--19.93%--schedule
| |
| --19.86%--__schedule
| |
| --19.23%--finish_task_switch
|
--16.47%--finish_wait
|
--16.43%--_raw_spin_unlock_irqrestore
9.30% 9.30% run-parts
|
---__GI___libc_close
system_call_fastpath
gsch_close_hook_fn
ac_hook_sys_close
__pre_event
|
--9.29%--notify_usermode
|
|--4.42%--prepare_to_wait
| |
| --4.41%--_raw_spin_unlock_irqrestore
|
|--2.58%--schedule
| |
| --2.57%--__schedule
| |
| --2.49%--finish_task_switch
|
--2.23%--finish_wait
|
--2.22%--_raw_spin_unlock_irqrestore
6.27% 6.27% virt-what
|
---__GI___libc_close
system_call_fastpath
gsch_close_hook_fn
ac_hook_sys_close
__pre_event
|
--6.27%--notify_usermode
|
|--3.15%--prepare_to_wait
| |
| --3.14%--_raw_spin_unlock_irqrestore
|
|--1.66%--finish_wait
| |
| --1.66%--_raw_spin_unlock_irqrestore
|
--1.42%--schedule
|
--1.41%--__schedule
|
--1.36%--finish_task_switch
- The 'sa1' tasks were trying to acquire a spinlock in the function
prepare_to_wait()
/finish_wait()
called by thenotify_usermode()
function.
# perf script
...
sa1 14568 [003] 476233.551035: 250000 cpu-clock:
ffffffffb0588731 _raw_spin_lock_irqsave+0x11 ([kernel.kallsyms])
ffffffffafec73b8 finish_wait+0x38 ([kernel.kallsyms])
ffffffffc09b2363 notify_usermode+0x163 ([acdc])
ffffffffc09b2798 __pre_event+0x38 ([acdc])
ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
7feca7501040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
...
sa1 14568 [003] 476234.192034: 250000 cpu-clock:
ffffffffb058872b _raw_spin_lock_irqsave+0xb ([kernel.kallsyms])
ffffffffafec7247 prepare_to_wait+0x27 ([kernel.kallsyms])
ffffffffc09b230f notify_usermode+0x10f ([acdc])
ffffffffc09b2798 __pre_event+0x38 ([acdc])
ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
7feca7501040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
...
sa1 23689 [003] 476233.413786: 250000 cpu-clock:
ffffffffb05883c5 _raw_spin_unlock_irqrestore+0x15 ([kernel.kallsyms])
ffffffffafec7276 prepare_to_wait+0x56 ([kernel.kallsyms])
ffffffffc09b230f notify_usermode+0x10f ([acdc])
ffffffffc09b2798 __pre_event+0x38 ([acdc])
ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
7fc39ebaa040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
...
sa1 23689 [003] 476233.414787: 250000 cpu-clock:
ffffffffb05883c5 _raw_spin_unlock_irqrestore+0x15 ([kernel.kallsyms])
ffffffffafec73d6 finish_wait+0x56 ([kernel.kallsyms])
ffffffffc09b2363 notify_usermode+0x163 ([acdc])
ffffffffc09b2798 __pre_event+0x38 ([acdc])
ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
7fc39ebaa040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
Kernel Source: kernel/wait.c
..
69 prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state)
70 {
71 unsigned long flags;
72
73 wait->flags &= ~WQ_FLAG_EXCLUSIVE;
74 spin_lock_irqsave(&q->lock, flags);
75 if (list_empty(&wait->task_list))
76 __add_wait_queue(q, wait);
77 set_current_state(state);
78 spin_unlock_irqrestore(&q->lock, flags);
..
Kernel Source: kernel/wait.c
...
105 void finish_wait(wait_queue_head_t *q, wait_queue_t *wait)
106 {
...
123 if (!list_empty_careful(&wait->task_list)) {
124 spin_lock_irqsave(&q->lock, flags);
125 list_del_init(&wait->task_list);
126 spin_unlock_irqrestore(&q->lock, flags);
...
- The
notify_usermode()
function is the part of an unsigned (E) kernel module [acdc
].
$ grep notify_usermode /proc/kallsyms
ffffffffc09b2200 t notify_usermode [acdc]
$ grep acdc /proc/modules
acdc 146194 40 - Live 0xffffffffc09ad000 (POE)
- The kernel syscall table is modified by the kernel module [
acdc
].
$ grep acdc proc/kallsyms | grep ac_hook | column -t
ffffffffc09ad000 t ac_hook_ptregs_sys_execve [acdc]
ffffffffc09ad00e t ac_hook_stub_execve [acdc]
ffffffffc09ad039 t ac_hook_ptregs_sys_execveat [acdc]
ffffffffc09ad047 t ac_hook_stub_execveat [acdc]
ffffffffc09ad072 t ac_hook_ptregs_sys_x32_execve [acdc]
ffffffffc09ad080 t ac_hook_stub_x32_execve [acdc]
ffffffffc09ad0ab t ac_hook_ptregs_sys_x32_execveat [acdc]
ffffffffc09ad0b9 t ac_hook_stub_x32_execveat [acdc]
ffffffffc09addb0 t ac_hook_disabled [acdc]
ffffffffc09b470e t ac_hook_state_update.part.8 [acdc]
ffffffffc09b3560 t ac_hook_sys_renameat [acdc]
ffffffffc09b3070 t ac_hook_sys_chmod [acdc]
ffffffffc09b3af0 t ac_hook_sys_fchown [acdc]
ffffffffc09b2950 t ac_hook_sys_dup2 [acdc]
ffffffffc09b31a0 t ac_hook_sys_fchmod [acdc]
ffffffffc09b3fc0 t ac_hook_sys_mmap [acdc]
ffffffffc09b3e80 t ac_hook_sys_unlinkat [acdc]
ffffffffc09b2a70 t ac_hook_sys_dup3 [acdc]
ffffffffc09b2f80 t ac_hook_sys_close [acdc]
ffffffffc09b0a60 t ac_hook_read_unlock [acdc]
ffffffffc09b3850 t ac_hook_sys_chown [acdc]
ffffffffc09b45f0 t ac_hook_sys_x32_execveat [acdc]
ffffffffc09b2830 t ac_hook_sys_dup [acdc]
ffffffffc09b2ce0 t ac_hook_sys_openat [acdc]
ffffffffc09b2b90 t ac_hook_sys_open [acdc]
ffffffffc09b39a0 t ac_hook_sys_lchown [acdc]
ffffffffc09b0a40 t ac_hook_read_lock [acdc]
ffffffffc09b32c0 t ac_hook_sys_fchmodat [acdc]
ffffffffc09b3c00 t ac_hook_sys_fchownat [acdc]
ffffffffc09b2e40 t ac_hook_sys_creat [acdc]
ffffffffc09b4410 t ac_hook_sys_execveat [acdc]
ffffffffc09b36d0 t ac_hook_sys_renameat2 [acdc]
ffffffffc09b3400 t ac_hook_sys_rename [acdc]
ffffffffc09b3d50 t ac_hook_sys_unlink [acdc]
ffffffffc09b4320 t ac_hook_sys_execve [acdc]
ffffffffc09b4500 t ac_hook_sys_x32_execve [acdc]
Diagnostic Steps
- Capture performance data with the following perf command while an issue occurs:
# perf record -agT -- sleep 10
# perf archive
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Comments