Why system shows high number of context switching and interrupt rate?

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux

Issue

  • Observed high number of context switching and interrupt rate on Linux box, is this a cause of concern?
10:45:02 AM    proc/s   cswch/s
10:45:03 AM   7461.86 162656.70
10:45:04 AM   7577.08 165451.04
10:45:05 AM   7269.07 158628.87
10:45:06 AM   7202.02 156147.47
10:45:07 AM   6997.96 150135.71
10:45:08 AM   5878.43 129769.61
10:45:09 AM      0.00   2238.38
10:45:10 AM      1.00   1753.00
10:45:11 AM      0.00   1659.00
10:45:12 AM      1.02   1956.12
10:45:13 AM   1472.55  29550.00
10:45:14 AM   7503.09 164700.00
10:45:15 AM   7564.95 163741.24
10:45:16 AM   7130.00 154742.00
10:45:17 AM   7367.01 162021.65
10:45:18 AM   6876.24 147852.48
10:45:19 AM   6965.69 150706.86
10:45:20 AM   6059.38 135597.92
10:45:21 AM      6.06   2325.25
10:45:22 AM   5360.20 118755.10
10:45:23 AM   7123.76 158248.51
10:45:24 AM   6091.92 133512.12
10:45:25 AM   7167.00 156230.00
10:45:26 AM   6929.70 152298.02
10:45:27 AM   7541.24 166132.99
10:45:28 AM   7544.33 165311.34
10:45:29 AM    328.28  10556.57
10:45:30 AM      1.00   3835.00
10:45:31 AM      9.00   3728.00
10:45:32 AM      0.00   3266.67
10:45:33 AM   1000.00  32036.36
10:45:34 AM   6616.16 151763.64
10:45:35 AM   7281.00 158306.00

Resolution

  • A context switch is described as the kernel suspending execution of one process on the CPU and resuming execution of some other process that had previously been suspended. A context switch is required for every interrupt and every task that the scheduler picks.

  • Context switching can be due to multitasking, Interrupt handling , user & kernel mode switching. The interrupt rate will naturally go high, if there is higher network traffic, or higher disk traffic. Also it is dependent on the application which every now and then invoking system calls.

  • If the cores/CPU's are not sufficient to handle load of threads created by application will also result in context switching.

  • It is not a cause of concern until performance breaks down. This is expected that CPU will do context switching. One shouldn't verify these data at first place since there are many statistical data which should be analyzed prior to looking into kernel activities. Verify the CPU, memory and network usage during this time. Sar utility will provide these data.

Diagnostic Steps

  • Collect following output to check which process is causing issue :
# pidstat -w 3 10   > /tmp/pidstat.out

10:15:24 AM     UID     PID     cswch/s         nvcswch/s       Command 
10:15:27 AM     0       1       162656.7        16656.7         systemd
10:15:27 AM     0       9       165451.04       15451.04        ksoftirqd/0
10:15:27 AM     0       10      158628.87       15828.87        rcu_sched
10:15:27 AM     0       11      156147.47       15647.47        migration/0
10:15:27 AM     0       17      150135.71       15035.71        ksoftirqd/1
10:15:27 AM     0       23      129769.61       12979.61        ksoftirqd/2
10:15:27 AM     0       29      2238.38         238.38          ksoftirqd/3
10:15:27 AM     0       43      1753            753             khugepaged
10:15:27 AM     0       443     1659            165             usb-storage
10:15:27 AM     0       456     1956.12         156.12          i915/signal:0
10:15:27 AM     0       465     29550           29550           kworker/3:1H-xfs-log/dm-3
10:15:27 AM     0       490     164700          14700           kworker/0:1H-kblockd
10:15:27 AM     0       506     163741.24       16741.24        kworker/1:1H-xfs-log/dm-3
10:15:27 AM     0       594     154742          154742          dmcrypt_write/2
10:15:27 AM     0       629     162021.65       16021.65        kworker/2:1H-kblockd
10:15:27 AM     0       715     147852.48       14852.48        xfsaild/dm-1
10:15:27 AM     0       886     150706.86       15706.86        irq/131-iwlwifi
10:15:27 AM     0       966     135597.92       13597.92        xfsaild/dm-3
10:15:27 AM     81      1037    2325.25         225.25          dbus-daemon
10:15:27 AM     998     1052    118755.1        11755.1         polkitd
10:15:27 AM     70      1056    158248.51       15848.51        avahi-daemon
10:15:27 AM     0       1061    133512.12       455.12          rngd
10:15:27 AM     0       1110    156230          16230           cupsd
10:15:27 AM     0       1192    152298.02       1598.02         sssd_nss
10:15:27 AM     0       1247    166132.99       16632.99        systemd-logind
10:15:27 AM     0       1265    165311.34       16511.34        cups-browsed
10:15:27 AM     0       1408    10556.57        1556.57         wpa_supplicant
10:15:27 AM     0       1687    3835            3835            splunkd
10:15:27 AM     42      1773    3728            3728            Xorg
10:15:27 AM     42      1996    3266.67         266.67          gsd-color
10:15:27 AM     0       3166    32036.36        3036.36         sssd_kcm
10:15:27 AM     119349  3194    151763.64       11763.64        dbus-daemon
10:15:27 AM     119349  3199    158306          18306           Xorg
10:15:27 AM     119349  3242    15.28           5.8             gnome-shell

# pidstat -wt 3 10  > /tmp/pidstat-t.out

Linux 4.18.0-80.11.2.el8_0.x86_64 (hostname)    09/08/2020  _x86_64_    (4 CPU)

10:15:15 AM   UID      TGID       TID   cswch/s   nvcswch/s  Command
10:15:19 AM     0         1         -   152656.7   16656.7   systemd
10:15:19 AM     0         -         1   152656.7   16656.7   |__systemd
10:15:19 AM     0         9         -   165451.04  15451.04  ksoftirqd/0
10:15:19 AM     0         -         9   165451.04  15451.04  |__ksoftirqd/0
10:15:19 AM     0        10         -   158628.87  15828.87  rcu_sched
10:15:19 AM     0         -        10   158628.87  15828.87  |__rcu_sched
10:15:19 AM     0        23         -   129769.61  12979.61  ksoftirqd/2
10:15:19 AM     0         -        23   129769.61  12979.33  |__ksoftirqd/2
10:15:19 AM     0        29         -   32424.5    2445      ksoftirqd/3
10:15:19 AM     0         -        29   32424.5    2445      |__ksoftirqd/3
10:15:19 AM     0        43         -   334        34        khugepaged
10:15:19 AM     0         -        43   334        34        |__khugepaged
10:15:19 AM     0       443         -   11465      566       usb-storage
10:15:19 AM     0         -       443   6433       93        |__usb-storage
10:15:19 AM     0       456         -   15.41      0.00      i915/signal:0
10:15:19 AM     0         -       456   15.41      0.00      |__i915/signal:0
10:15:19 AM     0       715         -   19.34      0.00      xfsaild/dm-1
10:15:19 AM     0         -       715   19.34      0.00      |__xfsaild/dm-1
10:15:19 AM     0       886         -   23.28      0.00      irq/131-iwlwifi
10:15:19 AM     0         -       886   23.28      0.00      |__irq/131-iwlwifi
10:15:19 AM     0       966         -   19.67      0.00      xfsaild/dm-3
10:15:19 AM     0         -       966   19.67      0.00      |__xfsaild/dm-3
10:15:19 AM    81      1037         -   6.89       0.33      dbus-daemon
10:15:19 AM    81         -      1037   6.89       0.33      |__dbus-daemon
10:15:19 AM     0      1038         -   11567.31   4436      NetworkManager
10:15:19 AM     0         -      1038   1.31       0.00      |__NetworkManager
10:15:19 AM     0         -      1088   0.33       0.00      |__gmain
10:15:19 AM     0         -      1094   1340.66    0.00      |__gdbus
10:15:19 AM   998      1052         -   118755.1   11755.1   polkitd
10:15:19 AM   998         -      1052   32420.66   25545     |__polkitd
10:15:19 AM   998         -      1132   0.66       0.00      |__gdbus

Then with help of PID which is causing issue, one can get all system calls details:

# strace -c -f -p <pid of process/thread>

Let this command run for a few minutes while the load/context switch rates are high. It is safe to run this on a production system so you could run it on a good system as well to provide a comparative baseline. Through strace, one can debug & troubleshoot the issue, by looking at system calls the process has made.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments