[RHEL7] kdump service is reloaded many times during boot phase on a large memory system

Issue

  • The kdump service is reloaded many times during the boot phase on a large-memory system.
  • The server becomes slow to respond, or even hangs during boot, while systemd-udevd repeatedly reloads the kdump service.
  • Because of these frequent kdump reload attempts during boot, kernfs_mutex, which must be taken when opening files on sysfs, becomes severely contended, and udevd tends to get stuck while going through the /sys/devices/... entries.
-- Reboot --
Mar 15 18:06:30 localhost.localdomain systemd-journal[919]: Runtime journal is using 8.0M (max allowed 4.0G, trying to leave 4.0G free of 187.9G available → current limit 4.0G).
Mar 15 18:06:30 localhost.localdomain kernel: microcode: microcode updated early to revision 0x2006a08, date = 2020-06-16
Mar 15 18:06:30 localhost.localdomain kernel: Initializing cgroup subsys cpuset
Mar 15 18:06:30 localhost.localdomain kernel: Initializing cgroup subsys cpu
Mar 15 18:06:30 localhost.localdomain kernel: Initializing cgroup subsys cpuacct
Mar 15 18:06:30 localhost.localdomain kernel: Linux version 3.10.0-1160.11.1.rt56.1145.bz1926043.el7.x86_64 (mockbuild@x86-040.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP PREEMPT RT Mon Feb 22 16:24:06 UTC 2021
Mar 15 18:06:30 localhost.localdomain kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-1160.11.1.rt56.1145.bz1926043.el7.x86_64 root=/dev/mapper/rootvg-root ro audit=1 crashkernel=768M spectre_v2=retpoline rd.lvm.lv=rootvg/root selinux=0 ipv6.disable=0 console=ttyS1,115200 cgroup.memory=nokmem raid=noautodetect no_timer_check clock=tsc clocksource=tsc tsc=reliable rcu_nocbs=1-19,21-39,41-59,61-79 rcu_nocb_poll=1 nohz=on nohz_full=1-19,21-39,41-59,61-79 isolcpus=1-19,21-39,41-59,61-79 irqaffinity=0,20,40,60 enforcing=0 noswap default_hugepagesz=1G hugepagesz=1G hugepages=263 mce=off nmi_watchdog=1 fsck.mode=force fsck.repair=yes skew_tick=1 softlockup_panic=0 idle=poll nosoftlockup intel_pstate=disable intel_idle.max_cstate=1 iommu=pt intel_iommu=on pcie_aspm.policy=performance crash_kexec_post_notifiers systemd.log_level=debug systemd.log_target=kmsg log_buf_len=15M skew_tick=1 isolcpus=1-19,21-39,41-59,61-79 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-19,21-39,41-59,61-79 rcu_nocbs=1-19,21-39,41-59,61-79
    ...
Mar 15 18:06:52 localhost.localdomain systemd-udevd[1690]: RUN '/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet /usr/bin/kdumpctl reload'' /usr/lib/udev/rules.d/98-kexec.rules:14
    ...
Mar 15 18:06:52 localhost.localdomain systemd-udevd[1692]: RUN '/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet /usr/bin/kdumpctl reload'' /usr/lib/udev/rules.d/98-kexec.rules:14
    ...
Mar 15 18:06:52 localhost.localdomain systemd-udevd[1691]: RUN '/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet /usr/bin/kdumpctl reload'' /usr/lib/udev/rules.d/98-kexec.rules:14
    ...
Mar 15 18:06:52 localhost.localdomain systemd-udevd[1689]: RUN '/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet /usr/bin/kdumpctl reload'' /usr/lib/udev/rules.d/98-kexec.rules:14
    ...
  • systemd-udevd runs kdumpctl reload 688 times on boot in this case:
$ cat ./sos_commands/logs/journalctl_--no-pager | awk '/Mar 15 18:/&&/kdumpctl reload/' | wc -l
688
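
The total above can be broken down by timestamp to see how tightly the reloads cluster during boot. The following is a minimal sketch, not part of the original analysis: the function name reloads_per_second is hypothetical, and it assumes the default syslog-style journal layout shown above, where the first three fields are month, day, and time.

```shell
#!/bin/sh
# Hypothetical helper: count "kdumpctl reload" journal lines per second.
# Assumes fields 1-3 of each line are the timestamp (e.g. "Mar 15 18:06:52").
reloads_per_second() {
    # $1: path to a saved journal, such as the sosreport journalctl output
    awk '/kdumpctl reload/ {
             count[$1 " " $2 " " $3]++      # key on the full timestamp
         }
         END {
             for (ts in count) print count[ts], ts
         }' "$1" | sort -k1,1nr             # busiest seconds first
}
```

For example, reloads_per_second ./sos_commands/logs/journalctl_--no-pager prints one line per second of boot, with the reload count first. Unlike the command above, this sketch does not filter on the boot date, so a journal that spans several boots would need to be trimmed first.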

Environment

  • Red Hat Enterprise Linux 7.3 and newer (non-rt kernel)
  • Red Hat Enterprise Linux 7.3 Realtime and newer (kernel-rt)
  • kexec-tools older than kexec-tools-2.0.15-51.el7_9.3
  • Systems with a large amount of RAM
  • A system with 376 GB of RAM in one case
  • A system with 2 TB of RAM in another case
  • Occasionally, a system with only a few gigabytes of RAM
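
To check whether an installed kexec-tools package falls in the version range listed above, the installed version-release string can be compared against 2.0.15-51.el7_9.3. The sketch below is an illustration under stated assumptions: the names THRESHOLD and is_older_than_threshold are hypothetical, and GNU sort -V is used only as an approximation of RPM version ordering, which it does not reproduce exactly in every case.

```shell
#!/bin/sh
# Version-release taken from the Environment list above.
THRESHOLD="2.0.15-51.el7_9.3"

# Hypothetical helper: succeeds (exit 0) if $1 sorts before THRESHOLD.
# Uses GNU "sort -V", an approximation of RPM's own comparison.
is_older_than_threshold() {
    [ "$1" = "$THRESHOLD" ] && return 1
    lowest=$(printf '%s\n%s\n' "$1" "$THRESHOLD" | sort -V | head -n 1)
    [ "$lowest" = "$1" ]
}

# Example usage on the affected host (left commented out in this sketch):
# installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' kexec-tools)
# is_older_than_threshold "$installed" && echo "kexec-tools $installed is in the listed range"
```

A result from this helper is a hint only; rpm -q kexec-tools shows the authoritative installed version.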
