High %sys CPU usage and an uptick in the load when 3rd party modules from TrendMicro are in use

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 7
  • Trend Micro Deep Security Agent (ds_agent.service)
  • Trend Micro Deep Security module [acdc]

Issue

  • The top command shows that the high CPU utilization is caused by the user-space processes sa1 and run-parts.
  • The high CPU utilization in the kernel space (%system) causing slowness and an uptick in the load.
12:00:02 AM     CPU     %user     %nice    %system    %iowait   %steal    %idle
12:10:01 AM     all      0.61      0.00     99.39      0.00      0.00      0.00
12:20:02 AM     all      0.50      0.00     99.50      0.00      0.00      0.00
12:30:01 AM     all      0.67      0.00     99.33      0.00      0.00      0.00

12:00:02 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
12:10:01 AM        33       576     32.23     32.20     32.21         0
12:20:02 AM        33       577     32.14     32.21     32.22         0
12:30:01 AM        34       576     32.20     32.35     32.35         0

Resolution

  • Engage the Trend Micro support team for further assistance.
  • Check for updates or patches to fix this issue.

Workaround

  • Blacklist the Trend Micro Deep Security module [acdc].
  • Stop and disable the Trend Micro Deep Security Agent service (ds_agent).
# systemctl stop ds_agent.service 
# systemctl disable ds_agent.service 

Root Cause

  • The high CPU utilization in the kernel-space (%system) is due to spinlock contention caused by the Trend Micro Deep Security module [acdc].
# perf report -i perf.data --show-cpu-utilization --show-nr-samples --max-stack 0 -d "[kernel.kallsyms]" --percent-limit 1 --stdio 
# To display the perf.data header info, please use --header/--header-only options.
#
# dso: [kernel.kallsyms]
#
# Total Lost Samples: 0
#
# Samples: 489K of event 'cpu-clock'
# Event count (approx.): 122333500000
#
# Children      Self       sys       usr       Samples  Command          Symbol                                  
# ........  ........  ........  ........  ............  ...............  ........................................
#
    48.09%    48.09%    48.09%     0.00%        235303  sa1              [k] _raw_spin_unlock_irqrestore
    19.23%    19.23%    19.23%     0.00%         94100  sa1              [k] finish_task_switch
     6.64%     6.64%     6.64%     0.00%         32512  run-parts        [k] _raw_spin_unlock_irqrestore
     4.81%     4.81%     4.81%     0.00%         23524  virt-what        [k] _raw_spin_unlock_irqrestore
     2.49%     2.49%     2.49%     0.00%         12172  run-parts        [k] finish_task_switch
     2.43%     2.43%     2.43%     0.00%         11888  SapphireIMSAgen  [k] _raw_spin_unlock_irqrestore
     2.39%     2.39%     2.39%     0.00%         11707  0anacron         [k] _raw_spin_unlock_irqrestore
     2.07%     2.07%     2.07%     0.00%         10122  ksmtuned         [k] _raw_spin_unlock_irqrestore
     1.84%     1.84%     1.84%     0.00%          9011  sudo             [k] _raw_spin_unlock_irqrestore
     1.75%     1.75%     1.75%     0.00%          8545  bash             [k] _raw_spin_unlock_irqrestore
     1.36%     1.36%     1.36%     0.00%          6633  virt-what        [k] finish_task_switch
     1.33%     1.33%     1.33%     0.00%          6484  bash             [k] finish_task_switch
     1.23%     1.23%     1.23%     0.00%          6008  sudo             [k] finish_task_switch
     1.02%     1.02%     1.02%     0.00%          4971  ksmtuned         [k] finish_task_switch

# perf report --sort=comm --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 489K of event 'cpu-clock'
# Event count (approx.): 122333500000
#
# Children      Self  Command        
# ........  ........  ...............
#
    68.59%    68.59%  sa1            
            |
            ---__GI___libc_close
               system_call_fastpath
               gsch_close_hook_fn
               ac_hook_sys_close
               __pre_event
               |          
                --68.53%--notify_usermode
                          |          
                          |--31.67%--prepare_to_wait
                          |          |          
                          |           --31.60%--_raw_spin_unlock_irqrestore
                          |          
                          |--19.93%--schedule
                          |          |          
                          |           --19.86%--__schedule
                          |                     |          
                          |                      --19.23%--finish_task_switch
                          |          
                           --16.47%--finish_wait
                                     |          
                                      --16.43%--_raw_spin_unlock_irqrestore

     9.30%     9.30%  run-parts      
            |
            ---__GI___libc_close
               system_call_fastpath
               gsch_close_hook_fn
               ac_hook_sys_close
               __pre_event
               |          
                --9.29%--notify_usermode
                          |          
                          |--4.42%--prepare_to_wait
                          |          |          
                          |           --4.41%--_raw_spin_unlock_irqrestore
                          |          
                          |--2.58%--schedule
                          |          |          
                          |           --2.57%--__schedule
                          |                     |          
                          |                      --2.49%--finish_task_switch
                          |          
                           --2.23%--finish_wait
                                     |          
                                      --2.22%--_raw_spin_unlock_irqrestore

     6.27%     6.27%  virt-what      
            |
            ---__GI___libc_close
               system_call_fastpath
               gsch_close_hook_fn
               ac_hook_sys_close
               __pre_event
               |          
                --6.27%--notify_usermode
                          |          
                          |--3.15%--prepare_to_wait
                          |          |          
                          |           --3.14%--_raw_spin_unlock_irqrestore
                          |          
                          |--1.66%--finish_wait
                          |          |          
                          |           --1.66%--_raw_spin_unlock_irqrestore
                          |          
                           --1.42%--schedule
                                     |          
                                      --1.41%--__schedule
                                                |          
                                                 --1.36%--finish_task_switch

  • The 'sa1' tasks were trying to acquire a spinlock in the function prepare_to_wait()/finish_wait() called by the notify_usermode() function.
# perf script
...
sa1 14568 [003] 476233.551035:     250000 cpu-clock: 
        ffffffffb0588731 _raw_spin_lock_irqsave+0x11 ([kernel.kallsyms])
        ffffffffafec73b8 finish_wait+0x38 ([kernel.kallsyms])
        ffffffffc09b2363 notify_usermode+0x163 ([acdc])
        ffffffffc09b2798 __pre_event+0x38 ([acdc])
        ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
        ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
        ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
            7feca7501040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
...
sa1 14568 [003] 476234.192034:     250000 cpu-clock: 
        ffffffffb058872b _raw_spin_lock_irqsave+0xb ([kernel.kallsyms])
        ffffffffafec7247 prepare_to_wait+0x27 ([kernel.kallsyms])
        ffffffffc09b230f notify_usermode+0x10f ([acdc])
        ffffffffc09b2798 __pre_event+0x38 ([acdc])
        ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
        ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
        ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
            7feca7501040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)

...
sa1 23689 [003] 476233.413786:     250000 cpu-clock: 
        ffffffffb05883c5 _raw_spin_unlock_irqrestore+0x15 ([kernel.kallsyms])
        ffffffffafec7276 prepare_to_wait+0x56 ([kernel.kallsyms])
        ffffffffc09b230f notify_usermode+0x10f ([acdc])
        ffffffffc09b2798 __pre_event+0x38 ([acdc])
        ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
        ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
        ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
            7fc39ebaa040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)
...
sa1 23689 [003] 476233.414787:     250000 cpu-clock: 
        ffffffffb05883c5 _raw_spin_unlock_irqrestore+0x15 ([kernel.kallsyms])
        ffffffffafec73d6 finish_wait+0x56 ([kernel.kallsyms])
        ffffffffc09b2363 notify_usermode+0x163 ([acdc])
        ffffffffc09b2798 __pre_event+0x38 ([acdc])
        ffffffffc09b2fdb ac_hook_sys_close+0x5b ([acdc])
        ffffffffc09f2387 gsch_close_hook_fn+0x57 ([gsch])
        ffffffffb0592ed2 system_call_fastpath+0x25 ([kernel.kallsyms])
            7fc39ebaa040 __GI___libc_close+0x10 (/usr/lib64/libc-2.17.so)

Kernel Source: kernel/wait.c
..
69 prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state)
70 {
71         unsigned long flags;
72 
73         wait->flags &= ~WQ_FLAG_EXCLUSIVE;
74         spin_lock_irqsave(&q->lock, flags);
75         if (list_empty(&wait->task_list))
76                 __add_wait_queue(q, wait);
77         set_current_state(state);
78         spin_unlock_irqrestore(&q->lock, flags);
..

Kernel Source: kernel/wait.c
...
105 void finish_wait(wait_queue_head_t *q, wait_queue_t *wait)
106 {
...    
123         if (!list_empty_careful(&wait->task_list)) {
124                 spin_lock_irqsave(&q->lock, flags);
125                 list_del_init(&wait->task_list);
126                 spin_unlock_irqrestore(&q->lock, flags);
...
  • The notify_usermode() function is the part of an unsigned (E) kernel module [acdc].
$ grep notify_usermode /proc/kallsyms 
ffffffffc09b2200 t notify_usermode  [acdc]

$ grep acdc /proc/modules 
acdc 146194 40 - Live 0xffffffffc09ad000 (POE)
  • The kernel syscall table is modified by the kernel module [acdc].
$ grep acdc proc/kallsyms  | grep ac_hook | column -t
ffffffffc09ad000  t  ac_hook_ptregs_sys_execve        [acdc]
ffffffffc09ad00e  t  ac_hook_stub_execve              [acdc]
ffffffffc09ad039  t  ac_hook_ptregs_sys_execveat      [acdc]
ffffffffc09ad047  t  ac_hook_stub_execveat            [acdc]
ffffffffc09ad072  t  ac_hook_ptregs_sys_x32_execve    [acdc]
ffffffffc09ad080  t  ac_hook_stub_x32_execve          [acdc]
ffffffffc09ad0ab  t  ac_hook_ptregs_sys_x32_execveat  [acdc]
ffffffffc09ad0b9  t  ac_hook_stub_x32_execveat        [acdc]
ffffffffc09addb0  t  ac_hook_disabled                 [acdc]
ffffffffc09b470e  t  ac_hook_state_update.part.8      [acdc]
ffffffffc09b3560  t  ac_hook_sys_renameat             [acdc]
ffffffffc09b3070  t  ac_hook_sys_chmod                [acdc]
ffffffffc09b3af0  t  ac_hook_sys_fchown               [acdc]
ffffffffc09b2950  t  ac_hook_sys_dup2                 [acdc]
ffffffffc09b31a0  t  ac_hook_sys_fchmod               [acdc]
ffffffffc09b3fc0  t  ac_hook_sys_mmap                 [acdc]
ffffffffc09b3e80  t  ac_hook_sys_unlinkat             [acdc]
ffffffffc09b2a70  t  ac_hook_sys_dup3                 [acdc]
ffffffffc09b2f80  t  ac_hook_sys_close                [acdc]
ffffffffc09b0a60  t  ac_hook_read_unlock              [acdc]
ffffffffc09b3850  t  ac_hook_sys_chown                [acdc]
ffffffffc09b45f0  t  ac_hook_sys_x32_execveat         [acdc]
ffffffffc09b2830  t  ac_hook_sys_dup                  [acdc]
ffffffffc09b2ce0  t  ac_hook_sys_openat               [acdc]
ffffffffc09b2b90  t  ac_hook_sys_open                 [acdc]
ffffffffc09b39a0  t  ac_hook_sys_lchown               [acdc]
ffffffffc09b0a40  t  ac_hook_read_lock                [acdc]
ffffffffc09b32c0  t  ac_hook_sys_fchmodat             [acdc]
ffffffffc09b3c00  t  ac_hook_sys_fchownat             [acdc]
ffffffffc09b2e40  t  ac_hook_sys_creat                [acdc]
ffffffffc09b4410  t  ac_hook_sys_execveat             [acdc]
ffffffffc09b36d0  t  ac_hook_sys_renameat2            [acdc]
ffffffffc09b3400  t  ac_hook_sys_rename               [acdc]
ffffffffc09b3d50  t  ac_hook_sys_unlink               [acdc]
ffffffffc09b4320  t  ac_hook_sys_execve               [acdc]
ffffffffc09b4500  t  ac_hook_sys_x32_execve           [acdc]

Diagnostic Steps

  • Capture performance data with the following perf command while an issue occurs:
# perf record -agT -- sleep 10
# perf archive

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments