Why does the system show a high number of context switches and a high interrupt rate?
Environment
- Red Hat Enterprise Linux
Issue
- A high number of context switches and a high interrupt rate were observed on a Linux box (sample below, in `sar -w` format). Is this a cause for concern?
10:45:02 AM proc/s cswch/s
10:45:03 AM 7461.86 162656.70
10:45:04 AM 7577.08 165451.04
10:45:05 AM 7269.07 158628.87
10:45:06 AM 7202.02 156147.47
10:45:07 AM 6997.96 150135.71
10:45:08 AM 5878.43 129769.61
10:45:09 AM 0.00 2238.38
10:45:10 AM 1.00 1753.00
10:45:11 AM 0.00 1659.00
10:45:12 AM 1.02 1956.12
10:45:13 AM 1472.55 29550.00
10:45:14 AM 7503.09 164700.00
10:45:15 AM 7564.95 163741.24
10:45:16 AM 7130.00 154742.00
10:45:17 AM 7367.01 162021.65
10:45:18 AM 6876.24 147852.48
10:45:19 AM 6965.69 150706.86
10:45:20 AM 6059.38 135597.92
10:45:21 AM 6.06 2325.25
10:45:22 AM 5360.20 118755.10
10:45:23 AM 7123.76 158248.51
10:45:24 AM 6091.92 133512.12
10:45:25 AM 7167.00 156230.00
10:45:26 AM 6929.70 152298.02
10:45:27 AM 7541.24 166132.99
10:45:28 AM 7544.33 165311.34
10:45:29 AM 328.28 10556.57
10:45:30 AM 1.00 3835.00
10:45:31 AM 9.00 3728.00
10:45:32 AM 0.00 3266.67
10:45:33 AM 1000.00 32036.36
10:45:34 AM 6616.16 151763.64
10:45:35 AM 7281.00 158306.00
Resolution
- A context switch occurs when the kernel suspends execution of one process on the CPU and resumes execution of another process that had previously been suspended. A context switch is required for every interrupt and for every task that the scheduler picks.
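The per-task counters behind these rates can be read directly from `/proc`. A minimal check, using the current shell's PID (`$$`) purely as a stand-in for the process of interest:

```shell
# voluntary_ctxt_switches: the task gave up the CPU (blocked on I/O, a lock, or sleep)
# nonvoluntary_ctxt_switches: the scheduler preempted the task
# $$ (the current shell) is used here only as an example PID.
grep ctxt_switches /proc/$$/status
```

A large nonvoluntary count points at CPU contention; a large voluntary count points at I/O or lock waits.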
- Context switching can be caused by multitasking, interrupt handling, and switching between user and kernel mode. The interrupt rate will naturally rise with higher network traffic or higher disk traffic. It also depends on how frequently the application invokes system calls.
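To see which interrupt sources are driving the rate, compare two snapshots of `/proc/interrupts`; a minimal sketch:

```shell
# Take two snapshots of the per-CPU interrupt counters one second apart;
# the lines that change between them identify the busiest sources
# (NIC queues, disk controllers, timers).
cat /proc/interrupts > /tmp/interrupts.1
sleep 1
cat /proc/interrupts > /tmp/interrupts.2
diff /tmp/interrupts.1 /tmp/interrupts.2 | head -20
```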
- If the cores/CPUs are not sufficient to handle the load of the threads created by the application, that will also result in context switching.
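A quick way to check whether the thread load exceeds the available cores is to compare the CPU count against the run queue; a sketch using only `/proc` and coreutils:

```shell
nproc                            # number of online CPUs
cat /proc/loadavg                # 1/5/15-minute load averages; 4th field is running/total tasks
grep procs_running /proc/stat    # tasks currently in a runnable state
# If runnable tasks consistently exceed the CPU count, involuntary
# context switches are expected.
```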
- This is not a cause for concern until performance degrades; it is expected that the CPU will perform context switching. These counters should not be the first data examined, as there is other statistical data that should be analyzed before looking into kernel activity. Verify CPU, memory, and network usage during the affected period; the sar utility provides this data.
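For that broader picture, sar (from the sysstat package on RHEL) can sample each subsystem; the intervals and counts below are arbitrary examples:

```shell
# Each command takes 5 one-second samples.
sar -u 1 5        # CPU utilization (%user, %system, %iowait, %idle)
sar -r 1 5        # memory utilization
sar -n DEV 1 5    # per-interface network traffic
sar -w 1 5        # task creation (proc/s) and context switches (cswch/s)
```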
Diagnostic Steps
- Collect the following output to identify which process is causing the issue:
# pidstat -w 3 10 > /tmp/pidstat.out
10:15:24 AM UID PID cswch/s nvcswch/s Command
10:15:27 AM 0 1 162656.7 16656.7 systemd
10:15:27 AM 0 9 165451.04 15451.04 ksoftirqd/0
10:15:27 AM 0 10 158628.87 15828.87 rcu_sched
10:15:27 AM 0 11 156147.47 15647.47 migration/0
10:15:27 AM 0 17 150135.71 15035.71 ksoftirqd/1
10:15:27 AM 0 23 129769.61 12979.61 ksoftirqd/2
10:15:27 AM 0 29 2238.38 238.38 ksoftirqd/3
10:15:27 AM 0 43 1753 753 khugepaged
10:15:27 AM 0 443 1659 165 usb-storage
10:15:27 AM 0 456 1956.12 156.12 i915/signal:0
10:15:27 AM 0 465 29550 29550 kworker/3:1H-xfs-log/dm-3
10:15:27 AM 0 490 164700 14700 kworker/0:1H-kblockd
10:15:27 AM 0 506 163741.24 16741.24 kworker/1:1H-xfs-log/dm-3
10:15:27 AM 0 594 154742 154742 dmcrypt_write/2
10:15:27 AM 0 629 162021.65 16021.65 kworker/2:1H-kblockd
10:15:27 AM 0 715 147852.48 14852.48 xfsaild/dm-1
10:15:27 AM 0 886 150706.86 15706.86 irq/131-iwlwifi
10:15:27 AM 0 966 135597.92 13597.92 xfsaild/dm-3
10:15:27 AM 81 1037 2325.25 225.25 dbus-daemon
10:15:27 AM 998 1052 118755.1 11755.1 polkitd
10:15:27 AM 70 1056 158248.51 15848.51 avahi-daemon
10:15:27 AM 0 1061 133512.12 455.12 rngd
10:15:27 AM 0 1110 156230 16230 cupsd
10:15:27 AM 0 1192 152298.02 1598.02 sssd_nss
10:15:27 AM 0 1247 166132.99 16632.99 systemd-logind
10:15:27 AM 0 1265 165311.34 16511.34 cups-browsed
10:15:27 AM 0 1408 10556.57 1556.57 wpa_supplicant
10:15:27 AM 0 1687 3835 3835 splunkd
10:15:27 AM 42 1773 3728 3728 Xorg
10:15:27 AM 42 1996 3266.67 266.67 gsd-color
10:15:27 AM 0 3166 32036.36 3036.36 sssd_kcm
10:15:27 AM 119349 3194 151763.64 11763.64 dbus-daemon
10:15:27 AM 119349 3199 158306 18306 Xorg
10:15:27 AM 119349 3242 15.28 5.8 gnome-shell
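With the sample saved to /tmp/pidstat.out as above, the `Average:` summary lines can be sorted numerically on the cswch/s column (field 4) to surface the worst offenders; a sketch:

```shell
# pidstat prints per-interval samples followed by "Average:" summary lines;
# sorting those descending on the cswch/s field lists the top switchers.
grep '^Average:' /tmp/pidstat.out | sort -k4 -nr | head -10
```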
# pidstat -wt 3 10 > /tmp/pidstat-t.out
Linux 4.18.0-80.11.2.el8_0.x86_64 (hostname) 09/08/2020 _x86_64_ (4 CPU)
10:15:15 AM UID TGID TID cswch/s nvcswch/s Command
10:15:19 AM 0 1 - 152656.7 16656.7 systemd
10:15:19 AM 0 - 1 152656.7 16656.7 |__systemd
10:15:19 AM 0 9 - 165451.04 15451.04 ksoftirqd/0
10:15:19 AM 0 - 9 165451.04 15451.04 |__ksoftirqd/0
10:15:19 AM 0 10 - 158628.87 15828.87 rcu_sched
10:15:19 AM 0 - 10 158628.87 15828.87 |__rcu_sched
10:15:19 AM 0 23 - 129769.61 12979.61 ksoftirqd/2
10:15:19 AM 0 - 23 129769.61 12979.33 |__ksoftirqd/2
10:15:19 AM 0 29 - 32424.5 2445 ksoftirqd/3
10:15:19 AM 0 - 29 32424.5 2445 |__ksoftirqd/3
10:15:19 AM 0 43 - 334 34 khugepaged
10:15:19 AM 0 - 43 334 34 |__khugepaged
10:15:19 AM 0 443 - 11465 566 usb-storage
10:15:19 AM 0 - 443 6433 93 |__usb-storage
10:15:19 AM 0 456 - 15.41 0.00 i915/signal:0
10:15:19 AM 0 - 456 15.41 0.00 |__i915/signal:0
10:15:19 AM 0 715 - 19.34 0.00 xfsaild/dm-1
10:15:19 AM 0 - 715 19.34 0.00 |__xfsaild/dm-1
10:15:19 AM 0 886 - 23.28 0.00 irq/131-iwlwifi
10:15:19 AM 0 - 886 23.28 0.00 |__irq/131-iwlwifi
10:15:19 AM 0 966 - 19.67 0.00 xfsaild/dm-3
10:15:19 AM 0 - 966 19.67 0.00 |__xfsaild/dm-3
10:15:19 AM 81 1037 - 6.89 0.33 dbus-daemon
10:15:19 AM 81 - 1037 6.89 0.33 |__dbus-daemon
10:15:19 AM 0 1038 - 11567.31 4436 NetworkManager
10:15:19 AM 0 - 1038 1.31 0.00 |__NetworkManager
10:15:19 AM 0 - 1088 0.33 0.00 |__gmain
10:15:19 AM 0 - 1094 1340.66 0.00 |__gdbus
10:15:19 AM 998 1052 - 118755.1 11755.1 polkitd
10:15:19 AM 998 - 1052 32420.66 25545 |__polkitd
10:15:19 AM 998 - 1132 0.66 0.00 |__gdbus
Then, with the PID of the process or thread causing the issue, one can get details of all its system calls:
# strace -c -f -p <pid of process/thread>
Let this command run for a few minutes while the load/context-switch rates are high. It is safe to run on a production system, and it can also be run on a healthy system to provide a comparative baseline. By examining the system calls the process makes, the strace output can be used to debug and troubleshoot the issue.
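To bound the run, the trace can be wrapped in timeout(1); the example below traces /bin/true end-to-end just to show the `-c` summary format (attaching to the live PID identified above is the real use):

```shell
# -c aggregates time, call counts, and errors per system call; -f follows
# child threads/processes. For a live process, the attach form would be:
#   timeout 120 strace -c -f -p "$PID"
strace -c -f /bin/true
```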
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.