Crashkernel Reservation Size in the Presence of Third Party Modules
Red Hat Lightspeed can detect this issue
Environment
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 8
Issue
Since the size of the crashkernel reservation is determined on a sliding scale based on the amount of memory on the server, the use of crashkernel=auto may not accommodate the memory required to load third party modules. The use of this setting in the presence of third party modules may result in an undersized crashkernel reservation, leading to vmcore dumps failing, delaying both issue resolution and root cause analysis of unexpected reboots, crashes, or hangs.
Resolution
Although crashkernel=auto is appropriate for most environments, its algorithm sizes the crashkernel reservation using only the system architecture and total RAM. As previously stated, it does not account for third party modules that may be installed and loaded on a given system, or for the memory those modules need to load when the crashkernel boots.
Since memory requirements can vary from module to module and server to server depending on various factors such as workload, environment, module version, hardware present, and so on, the presence of third party modules on a server may necessitate altering the kdump configuration in order to successfully capture a vmcore. Given the wide range and scope of possible third party modules and their various sizes, there are two best practice recommendations.
- For third party modules that are installed and loaded and are required for kdump to successfully dump a vmcore, the crashkernel reservation size should be increased and manually set at the kernel command line. This may include but is not limited to third party networking drivers when dumping remotely via NFS or SSH, or third party storage drivers when dumping to a device that utilizes that driver.
- For third party modules that are installed and loaded but are not required for kdump to successfully dump a vmcore, the recommended best practice is to blacklist the unnecessary driver in the kdump configuration.
Diagnostics steps
kdump Deployment

Ensure the kexec-tools package that provides kdump is installed and that the kdump service is active:

    # rpm -qa | grep kexec
    kexec-tools-2.0.20-14.el8.x86_64

For Red Hat Enterprise Linux 7 or later:

    # systemctl status kdump
    ● kdump.service - Crash recovery kernel arming
       Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
       Active: active (exited) since Fri 2021-01-01 02:21:00 EST; 2s ago
    [...]

For Red Hat Enterprise Linux 6:

    # chkconfig | grep kdump
    kdump           0:off   1:off   2:on    3:on    4:on    5:on    6:off
If the kexec-tools package is absent or the kdump service is inactive, refer to the Red Hat kdump documentation to install, enable, start, and configure kdump before continuing.
Find Third Party Modules
If you are unsure whether your system has third party modules installed and loaded, you can check the /proc/modules file. Third party modules can be identified by the presence of a kernel taint flag in the /proc/modules output. Since taint flags are always parenthesized in the /proc/modules output, try grepping for an escaped parenthesis:

    # grep \( /proc/modules
    hello 16384 0 - Live 0xffffffffc050b000 (OE)

The output above shows an example third party module named hello that is installed and loaded. The taint flag (OE) indicates the module is a third party module: O marks an out-of-tree module and E marks an unsigned module. For more information, see Red Hat's documentation on kernel taint flags.
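The grep above can be narrowed to just the fields that matter here. The following is a minimal sketch, not part of kexec-tools: the function name `list_tainted_modules` is hypothetical, and it assumes a tainted entry always ends with a parenthesized flag group as in the sample output.

```shell
# Hypothetical helper (not shipped with kexec-tools): print the name, size in
# bytes, and taint flags of every module that carries a taint flag.
# The optional argument is the file to read; it defaults to /proc/modules.
list_tainted_modules() {
    # A tainted entry ends with a parenthesized flag group such as "(OE)";
    # untainted entries end with the load address instead.
    awk '$NF ~ /^\(.*\)$/ { print $1, $2, $NF }' "${1:-/proc/modules}"
}
```

Calling `list_tainted_modules` with no argument reads the running system's /proc/modules; passing a file name lets you inspect a saved copy from another server.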
Increase crashkernel Reservation for Modules Necessary for kdump to Dump a vmcore

Once you've identified the third party modules that are loaded on your server and necessary to dump a vmcore successfully, you need to increase the crashkernel reservation. The module size reported by lsmod or in the /proc/modules file (in bytes) can be used only as a starting point for the amount by which to increase the crashkernel reservation, because it is only the size necessary to load and initialize the module. The actual memory usage from a given module's execution can be much higher, and can vary by environment, including the hardware used. For example, the memory usage for a storage driver that scans 4 LUNs will be lower than that of the same driver scanning 400 LUNs.
Note that the /proc/modules output does not provide headers, so you must remember that the second space-separated field of each line is the module's size. The lsmod command does print headers, but lsmod does not distinguish between third party modules and Red Hat supported modules:

    # lsmod | head
    Module                  Size  Used by
    hello                  16384  0
    intel_rapl_msr         16384  0
    intel_rapl_common      24576  1 intel_rapl_msr
    isst_if_common         16384  0
    nfit                   65536  0
    libnvdimm             192512  1 nfit
    crct10dif_pclmul       16384  1
    crc32_pclmul           16384  0
    ghash_clmulni_intel    16384  0

    # grep \( /proc/modules
    hello 16384 0 - Live 0xffffffffc050b000 (OE)
Once you have determined the size of memory required to initialize the third party modules installed on your server, you must manually set the crashkernel size. Use the size you have derived from lsmod or /proc/modules as a starting point and test kdump for successful vmcore collection. Setting the crashkernel reservation manually is accomplished in different ways depending on the version of Red Hat Enterprise Linux you are running.
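As a rough illustration of deriving that starting point, the sketch below totals the load-time sizes of all tainted modules and rounds the result up to whole megabytes. Both function names are hypothetical, and the result is only a lower bound: as noted above, runtime memory usage can be much higher than the load size, so the value still has to be validated by testing kdump.

```shell
# Hypothetical helper: total size in bytes of all modules that carry a taint
# flag. The optional argument defaults to /proc/modules.
sum_tainted_module_bytes() {
    awk '$NF ~ /^\(.*\)$/ { total += $2 } END { print total + 0 }' "${1:-/proc/modules}"
}

# Hypothetical helper: round a byte count up to whole MiB.
bytes_to_mib_ceil() {
    echo $(( ($1 + 1048575) / 1048576 ))
}
```

For example, `bytes_to_mib_ceil "$(sum_tainted_module_bytes)"` prints the minimum number of megabytes to add to the current reservation before the first test.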
Setting the crashkernel Reservation
In Red Hat Enterprise Linux 6

The crashkernel reservation is made in Red Hat Enterprise Linux 6 by adding the desired value to the kernel line in /boot/grub/grub.conf, then rebooting the system. Here it is being set to 132 MB to account for third party modules:

    kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_rhel6-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_LVM_LV=vg_rhel6/lv_swap crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=vg_rhel6/lv_root rd_NO_DM console=tty0 console=ttyS0,115200 crashkernel=132M

    # reboot now
In Red Hat Enterprise Linux 7

In Red Hat Enterprise Linux 7 the crashkernel reservation is specified by naming it in /etc/default/grub, rebuilding the GRUB 2 configuration file, and rebooting the system:

    GRUB_CMDLINE_LINUX="crashkernel=132M rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=tty0 console=ttyS0,115200"    <<<---- crashkernel reservation specified on this line

On BIOS-based machines:

    # grub2-mkconfig -o /boot/grub2/grub.cfg

On UEFI-based machines:

    # grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

Then reboot:

    # reboot now
In Red Hat Enterprise Linux 8

In Red Hat Enterprise Linux 8 the crashkernel reservation is set on the kernelopts line in /boot/grub2/grubenv, followed by a reboot of the system:

    # grep crashkernel /boot/grub2/grubenv
    kernelopts=root=/dev/mapper/rhel-root ro crashkernel=132M@16M resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=tty0 console=ttyS0,115200

    # reboot now
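After the reboot it can be useful to confirm that the running kernel actually picked up the new value. The sketch below extracts the crashkernel= argument from a kernel command line. The function name `crashkernel_arg` is hypothetical, and printing the last occurrence rests on the assumption that a later argument overrides an earlier one, as in the Red Hat Enterprise Linux 6 example where crashkernel=auto precedes crashkernel=132M.

```shell
# Hypothetical helper: print the value of the last crashkernel= argument in a
# kernel command line. With no argument it reads the running kernel's
# /proc/cmdline.
crashkernel_arg() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # Split on spaces, keep only crashkernel= values, and take the last one.
    printf '%s\n' "$cmdline" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | tail -n 1
}
```

Running `crashkernel_arg` with no argument should print the value you configured, for example 132M. The size the kernel actually reserved, in bytes, can also be read with `cat /sys/kernel/kexec_crash_size`.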
Test Kdump
The final step after modifying your kdump configuration in any way, including resizing the crashkernel reservation or blacklisting modules, is to test kdump to ensure a vmcore is successfully dumped with the new configuration in use. First, ensure that SysRq is enabled:

    # echo 1 > /proc/sys/kernel/sysrq

Then execute the following command. Warning: The following command will cause a kernel panic and crash your system!

    # echo c > /proc/sysrq-trigger

Finally, once the system has rebooted, verify a vmcore exists in the target dump path:

    # ls /var/crash/127.0.0.1-2020-08-21-12\:11\:00/
    vmcore  vmcore-dmesg.txt
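Because the dump directory name includes a timestamp, searching for the vmcore file can be easier than guessing the path. This is a minimal sketch under the assumption that dumps are written locally beneath /var/crash, as in the example above; the function name `find_vmcores` is hypothetical, and remote NFS or SSH targets would need to be checked on the receiving host instead.

```shell
# Hypothetical helper: list vmcore files beneath a dump directory.
# The optional argument defaults to /var/crash, the path used in this
# article's examples.
find_vmcores() {
    # Match files named exactly "vmcore"; vmcore-dmesg.txt is not listed.
    find "${1:-/var/crash}" -type f -name vmcore 2>/dev/null
}
```

If `find_vmcores` prints nothing after the test crash, the dump did not complete, and the crashkernel reservation may still be too small.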
Blacklisting Third Party Modules Unnecessary for kdump to Successfully Dump a vmcore (Optional)

For third party modules that are not needed to dump a vmcore, it is recommended to blacklist the modules within the kdump configuration. This is accomplished in different ways depending on the version of Red Hat Enterprise Linux you are running.
For Red Hat Enterprise Linux 6:

In Red Hat Enterprise Linux 6 kernel modules can be blacklisted by editing the /etc/kdump.conf file to include a blacklist line, as seen in the example below, and then restarting the kdump service:

    # grep -v ^# /etc/kdump.conf
    path /var/crash
    core_collector makedumpfile -c --message-level 1 -d 31
    blacklist module1 module2

    # service kdump restart
    Stopping kdump:                                            [  OK  ]
    Starting kdump:                                            [  OK  ]
For Red Hat Enterprise Linux 7 and later:

In Red Hat Enterprise Linux 7 and later, modules are blacklisted on the KDUMP_COMMANDLINE_APPEND line of the /etc/sysconfig/kdump file with the rd.driver.blacklist directive. The kdump service must then be restarted:

    KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never nokaslr novmcoredd hest_disable rd.driver.blacklist=module1,module2,module3,module4"

    # systemctl restart kdump
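The rd.driver.blacklist value is a single comma-separated list, which is easy to get wrong when several modules are involved. The following sketch builds the argument from a list of module names; the function name `blacklist_arg` is hypothetical, and the module names are the placeholders used above.

```shell
# Hypothetical helper: join module names into one rd.driver.blacklist=
# argument suitable for appending to KDUMP_COMMANDLINE_APPEND.
blacklist_arg() {
    # Refuse to emit an empty directive.
    [ "$#" -gt 0 ] || return 1
    old_ifs=$IFS
    IFS=,
    # "$*" joins the arguments with the first character of IFS (a comma).
    printf 'rd.driver.blacklist=%s\n' "$*"
    IFS=$old_ifs
}
```

For example, `blacklist_arg module1 module2` produces the text to place inside the quoted KDUMP_COMMANDLINE_APPEND value before restarting the kdump service.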