Red Hat Training

A Red Hat training course is available for RHEL 8

Chapter 15. Configuring kdump on the command line

The memory for kdump is reserved during the system boot. The memory size is configured in the system’s Grand Unified Bootloader (GRUB) configuration file. The memory size depends on the crashkernel= value specified in the configuration file and the size of the system’s physical memory.

15.1. Estimating the kdump size

When planning and building your kdump environment, it is important to know how much space the crash dump file requires.

The makedumpfile --mem-usage command estimates how much space the crash dump file requires. It generates a memory usage report. The report helps you determine the dump level and which pages are safe to be excluded.

Procedure

  • Execute the following command to generate a memory usage report:

    # makedumpfile --mem-usage /proc/kcore
    
    
    TYPE        PAGES    EXCLUDABLE    DESCRIPTION
    -------------------------------------------------------------
    ZERO          501635      yes        Pages filled with zero
    CACHE         51657       yes        Cache pages
    CACHE_PRIVATE 5442        yes        Cache pages + private
    USER          16301       yes        User process pages
    FREE          77738211    yes        Free pages
    KERN_DATA     1333192     no         Dumpable kernel data
Important

The makedumpfile --mem-usage command reports required memory in pages. This means that you must calculate the size of memory in use against the kernel page size.

By default the RHEL kernel uses 4 KB sized pages on AMD64 and Intel 64 CPU architectures, and 64 KB sized pages on IBM POWER architectures.

15.2. Configuring kdump memory usage

The memory reservation for kdump occurs during the system boot. The memory size is set in the system’s Grand Unified Bootloader (GRUB) configuration. The memory size depends on the value of the crashkernel= option specified in the configuration file and the size of the system physical memory.

You can define the crashkernel= option in many ways. You can specify the crashkernel= value or configure the auto option. The crashkernel=auto parameter reserves memory automatically, based on the total amount of physical memory in the system. When configured, the kernel automatically reserves an appropriate amount of required memory for the capture kernel. This helps to prevent Out-of-Memory (OOM) errors.

Note

The automatic memory allocation for kdump varies based on system hardware architecture and available memory size.

For example, on AMD64 and Intel 64, the crashkernel=auto parameter works only when the available memory is more than 1GB. The 64-bit ARM architecture and IBM Power Systems needs more than 2 GB of available memory.

If the system has less than the minimum memory threshold for automatic allocation, you can configure the amount of reserved memory manually.

Prerequisites

Procedure

  1. Prepare the crashkernel= option.

    • For example, to reserve 128 MB of memory, use the following:

      crashkernel=128M
    • Alternatively, you can set the amount of reserved memory to a variable depending on the total amount of installed memory. The syntax for memory reservation into a variable is crashkernel=<range1>:<size1>,<range2>:<size2>. For example:

      crashkernel=512M-2G:64M,2G-:128M

      The command reserves 64 MB of memory if the total amount of system memory is in the range of 512 MB and 2 GB. If the total amount of memory is more than 2 GB, the memory reserve is 128 MB.

    • Offset the reserved memory.

      Some systems require to reserve memory with a certain fixed offset because the crashkernel reservation happens early, and you may need to reserve more memory for special usage. When you define an offset, the reserved memory begins there. To offset the reserved memory, use the following syntax:

      crashkernel=128M@16M

      In this example, kdump reserves 128 MB of memory starting at 16 MB (physical address 0x01000000). If you set the offset parameter to 0 or omit entirely, kdump offsets the reserved memory automatically. You can also use this syntax when setting a variable memory reservation. In that case, the offset is always specified last. For example:

      crashkernel=512M-2G:64M,2G-:128M@16M
  2. Apply the crashkernel= option to your boot loader configuration:

    # grubby --update-kernel=ALL --args="crashkernel=<value>"

    Replace <value> with the value of the crashkernel= option that you prepared in the previous step.

15.3. Configuring the kdump target

The crash dump is usually stored as a file in a local file system, written directly to a device. Alternatively, you can set up for the crash dump to be sent over a network using the NFS or SSH protocols. Only one of these options to preserve a crash dump file can be set at a time. The default behavior is to store it in the /var/crash/ directory of the local file system.

Prerequisites

Procedure

  • To store the crash dump file in /var/crash/ directory of the local file system, edit the /etc/kdump.conf file and specify the path:

    path /var/crash

    The option path /var/crash represents the path to the file system in which kdump saves the crash dump file.

    Note
    • When you specify a dump target in the /etc/kdump.conf file, then the path is relative to the specified dump target.
    • When you do not specify a dump target in the /etc/kdump.conf file, then the path represents the absolute path from the root directory.

    Depending on what is mounted in the current system, the dump target and the adjusted dump path are taken automatically.

    To secure the crash dump file and the accompanying files produced by kdump, you should set up proper attributes for the target destination directory, such as user permissions and SELinux contexts. Additionally, you can define a script, for example kdump_post.sh in the kdump.conf file as follows:

    kdump_post <path_to_kdump_post.sh>

    The kdump_post directive specifies a shell script or a command that is executed after kdump has completed capturing and saving a crash dump to the specified destination. You can use this mechanism to extend the functionality of kdump to perform actions including the adjustment of file permissions.

Example 15.1. The kdump target configuration

# grep -v ^# /etc/kdump.conf | grep -v ^$
ext4 /dev/mapper/vg00-varcrashvol
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31

Here, the dump target is specified (ext4 /dev/mapper/vg00-varcrashvol), and thus mounted at /var/crash. The path option is also set to /var/crash, so the kdump saves the vmcore file in the /var/crash/var/crash directory.

  • To change the local directory in which the crash dump is to be saved, as root, edit the /etc/kdump.conf configuration file:

    1. Remove the hash sign (#) from the beginning of the #path /var/crash line.
    2. Replace the value with the intended directory path. For example:

      path /usr/local/cores
      Important

      In RHEL 8, the directory defined as the kdump target using the path directive must exist when the kdump systemd service starts to avoid failures. This behavior is different from versions earlier than RHEL, where the directory is created automatically if it did not exist when the service starts.

  • To write the file to a different partition, edit the /etc/kdump.conf configuration file:

    1. Remove the hash sign (#) from the beginning of the #ext4 line, depending on your choice.

      • device name (the #ext4 /dev/vg/lv_kdump line)
      • file system label (the #ext4 LABEL=/boot line)
      • UUID (the #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 line)
    2. Change the file system type and the device name, label or UUID, to the required values. The correct syntax for specifying UUID values is both UUID="correct-uuid" and UUID=correct-uuid. For example:

      ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937
      Important

      It is recommended to specify storage devices using a LABEL= or UUID=. Disk device names such as /dev/sda3 are not guaranteed to be consistent across reboot.

      When you use Direct Access Storage Device (DASD) on IBM Z hardware, ensure the dump devices are correctly specified in /etc/dasd.conf before you proceed with kdump.

  • To write the crash dump directly to a device, edit the /etc/kdump.conf configuration file:

    1. Remove the hash sign (#) from the beginning of the #raw /dev/vg/lv_kdump line.
    2. Replace the value with the intended device name. For example:

      raw /dev/sdb1
  • To store the crash dump to a remote machine using the NFS protocol:

    1. Remove the hash sign (#) from the beginning of the #nfs my.server.com:/export/tmp line.
    2. Replace the value with a valid hostname and directory path. For example:

      nfs penguin.example.com:/export/cores
    3. Restart the kdump service for the changes to take effect:

      sudo systemctl restart kdump.service
      Note

      When using the NFS directive to specify the NFS target, kdump.service automatically attempts to mount the NFS target to check the disk space. There is no need to mount the NFS target beforehand. To prevent kdump.service from mounting the target, use the dracut_args --mount directive in kdump.conf so that kdump.service calls the dracut utility with the --mount argument to specify the NFS target.

  • To store the crash dump to a remote machine using the SSH protocol:

    1. Remove the hash sign (#) from the beginning of the #ssh user@my.server.com line.
    2. Replace the value with a valid username and hostname.
    3. Include your SSH key in the configuration.

      1. Remove the hash sign from the beginning of the #sshkey /root/.ssh/kdump_id_rsa line.
      2. Change the value to the location of a key valid on the server you are trying to dump to. For example:

        ssh john@penguin.example.com
        sshkey /root/.ssh/mykey

15.4. Configuring the kdump core collector

The kdump service uses a core_collector program to capture the crash dump image. In RHEL, the makedumpfile utility is the default core collector. It helps shrink the dump file by:

  • Compressing the size of a crash dump file and copying only necessary pages using various dump levels.
  • Excluding unnecessary crash dump pages.
  • Filtering the page types to be included in the crash dump.

Syntax

core_collector makedumpfile -l --message-level 1 -d 31

Options

  • -c, -l or -p: specify compress dump file format by each page using either, zlib for -c option, lzo for -l option or snappy for -p option.
  • -d (dump_level): excludes pages so that they are not copied to the dump file.
  • --message-level : specify the message types. You can restrict outputs printed by specifying message_level with this option. For example, specifying 7 as message_level prints common messages and error messages. The maximum value of message_level is 31

Prerequisites

Procedure

  1. As root, edit the /etc/kdump.conf configuration file and remove the hash sign ("#") from the beginning of the #core_collector makedumpfile -l --message-level 1 -d 31.
  2. To enable crash dump file compression, execute:
core_collector makedumpfile -l --message-level 1 -d 31

The -l option specifies the dump compressed file format. The -d option specifies dump level as 31. The --message-level option specifies message level as 1.

Also, consider following examples with the -c and -p options:

  • To compress a crash dump file using -c:
core_collector makedumpfile -c -d 31 --message-level 1
  • To compress a crash dump file using -p:
core_collector makedumpfile -p -d 31 --message-level 1

Additional resources

15.5. Configuring the kdump default failure responses

By default, when kdump fails to create a crash dump file at the configured target location, the system reboots and the dump is lost in the process. You can change the default failure response and configure kdump to perform a different operation in case it fails to save the core dump to the primary target. The additional actions are:

dump_to_rootfs
Saves the core dump to the root file system.
reboot
Reboots the system, losing the core dump in the process.
halt
Stops the system, losing the core dump in the process.
poweroff
Power the system off, losing the core dump in the process.
shell
Runs a shell session from within the initramfs, you can record the core dump manually.
final_action
Enables additional operations such as reboot, halt, and poweroff after a successful kdump or when shell or dump_to_rootfs failure action completes. The default is reboot.
failure_action
Specifies the action to perform when a dump might fail in a kernel crash. The default is reboot.

Prerequisites

Procedure

  1. As root, remove the hash sign (#) from the beginning of the #failure_action line in the /etc/kdump.conf configuration file.
  2. Replace the value with a desired action.

    failure_action poweroff

Additional resources

15.6. Configuration file for kdump

The configuration file for kdump kernel is /etc/sysconfig/kdump. This file controls the kdump kernel command line parameters. For most configurations, use the default options. However, in some scenarios you might need to modify certain parameters to control the kdump kernel behavior. For example, modifying the KDUMP_COMMANDLINE_APPEND option to append the kdump kernel command-line to obtain a detailed debugging output or the KDUMP_COMMANDLINE_REMOVE option to remove arguments from the kdump command line.

KDUMP_COMMANDLINE_REMOVE

This option removes arguments from the current kdump command line. It removes parameters that may cause kdump errors or kdump kernel boot failures. These parameters may have been parsed from the previous KDUMP_COMMANDLINE process or inherited from the /proc/cmdline file.

When this variable is not configured, it inherits all values from the /proc/cmdline file. Configuring this option also provides information that is helpful in debugging an issue.

To remove certain arguments, add them to KDUMP_COMMANDLINE_REMOVE as follows:

KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb"
KDUMP_COMMANDLINE_APPEND

This option appends arguments to the current command line. These arguments may have been parsed by the previous KDUMP_COMMANDLINE_REMOVE variable.

For the kdump kernel, disabling certain modules such as mce, cgroup, numa, hest_disable can help prevent kernel errors. These modules may consume a significant portion of the kernel memory reserved for kdump or cause kdump kernel boot failures.

To disable memory cgroups on the kdump kernel command line, run the command as follows:

KDUMP_COMMANDLINE_APPEND="cgroup_disable=memory"

Additional resources

  • The Documentation/admin-guide/kernel-parameters.txt file
  • The /etc/sysconfig/kdump file

15.7. Testing the kdump configuration

After configuring kdump, you must manually test a system crash and ensure that the vmcore file is generated in the defined kdump target. The vmcore file is captured from the context of the freshly booted kernel and therefore has critical information to help debug a kernel crash.

Warning

Do not test kdump on active production systems. The commands to test kdump will cause the kernel to crash with loss of data. Depending on your system architecture, ensure that you schedule significant maintenance time because kdump testing might require several reboots with a long boot time.

If the vmcore file is not generated during the kdump test, identify and fix issues before you run the test again for a successful kdump testing.

Important

Ensure that you schedule significant maintenance time, because kdump testing might require several reboots with a long boot time.

If you make any manual system modifications, you must test the kdump configuration at the end of any system modification. For example, if you make any of the following changes, ensure that you test the kdump configuration for an optimal kdump performance:

  • Package upgrades.
  • Hardware level changes, for example, storage or networking changes.
  • Firmware and BIOS upgrades.
  • New installation and application upgrades that include third party modules.
  • If you use the hot-plugging mechanism to add more memory on hardware that support this mechanism.
  • After you make changes in the /etc/kdump.conf or /etc/sysconfig/kdump file.

Prerequisites

  • You have root permissions on the system.
  • You have saved all important data. The commands to test kdump cause the kernel to crash with loss of data.
  • You have scheduled significant machine maintenance time depending on the system architecture.

Procedure

  1. Enable the kdump service:

    # kdumpctl restart
  2. Check the status of the kdump service. With the kdumpctl command, you can print the output at the console.

    # kdumpctl status
      kdump:Kdump is operational

    Alternatively, if you use the systemctl command, the output prints in the systemd journal.

  3. Initiate a kernel crash to test the kdump configuration. The sysrq-trigger key combination causes the kernel to crash and might reboot the system if required.

    # echo c > /proc/sysrq-trigger

    On a kernel reboot, the address-YYYY-MM-DD-HH:MM:SS/vmcore file is created at the location you have specified in the /etc/kdump.conf file. The default is /var/crash/.

Additional resources

15.8. Files produced by kdump after system crash

After your system crashes, the kdump service captures the kernel memory in a dump file (vmcore) and it also generates additional diagnostic files to aid in troubleshooting and post-mortem analysis.

Files produced by kdump:

  • vmcore - main kernel memory dump file containing system memory at the time of the crash. It includes data as per the configuration of the core_collector program specified in kdump configuration. By default the kernel data structures, process information, stack traces, and other diagnostic information.
  • vmcore-dmesg.txt - contents of the kernel ring buffer log (dmesg) from the primary kernel that panicked.
  • kexec-dmesg.log - contains kernel and system log messages from the execution of the secondary kexec kernel that collects the vmcore data.

15.9. Enabling and disabling the kdump service

You can configure to enable or disable the kdump functionality on a specific kernel or on all installed kernels. You must routinely test the kdump functionality and validate that it is working properly.

Prerequisites

  • You have root permissions on the system.
  • You have completed kdump requirements for configurations and targets. See Supported kdump configurations and targets.
  • All configurations for installing kdump are set up as required.

Procedure

  • Enable the kdump service for multi-user.target:

    # systemctl enable kdump.service
  • Start the service in the current session:

    # systemctl start kdump.service
  • Stop the kdump service:

    # systemctl stop kdump.service
  • Disable the kdump service:

    # systemctl disable kdump.service
Warning

It is recommended to set kptr_restrict=1 as default. When kptr_restrict is set to (1) as default, the kdumpctl service loads the crash kernel even if Kernel Address Space Layout (KASLR) is enabled or not enabled.

If kptr_restrict is not set to 1 and KASLR is enabled, the contents of /proc/kore file are generated as all zeros. The kdumpctl service fails to access the /proc/kcore file and load the crash kernel. The kexec-kdump-howto.txt file displays a warning message, which recommends you to set kptr_restrict=1. Verify for the following in the sysctl.conf file to ensure that kdumpctl service loads the crash kernel:

  • Kernel kptr_restrict=1 in the sysctl.conf file.

15.10. Preventing kernel drivers from loading for kdump

You can control the capture kernel from loading certain kernel drivers by adding the KDUMP_COMMANDLINE_APPEND= variable in the /etc/sysconfig/kdump configuration file. By using this method, you can prevent the kdump initial RAM disk image initramfs from loading the specified kernel module. This helps to prevent the out-of-memory (OOM) killer errors or other crash kernel failures.

You can append the KDUMP_COMMANDLINE_APPEND= variable using one of the following configuration options:

  • rd.driver.blacklist=<modules>
  • modprobe.blacklist=<modules>

Prerequisites

  • You have root permissions on the system.

Procedure

  1. Display the list of modules that are loaded to the currently running kernel. Select the kernel module that you intend to block from loading.

    $ lsmod
    
    Module                  Size  Used by
    fuse                  126976  3
    xt_CHECKSUM            16384  1
    ipt_MASQUERADE         16384  1
    uinput                 20480  1
    xt_conntrack           16384  1
  2. Update the KDUMP_COMMANDLINE_APPEND= variable in the /etc/sysconfig/kdump file. For example:

    KDUMP_COMMANDLINE_APPEND="rd.driver.blacklist=hv_vmbus,hv_storvsc,hv_utils,hv_netvsc,hid-hyperv"

    Also, consider the following example using the modprobe.blacklist=<modules> configuration option:

    KDUMP_COMMANDLINE_APPEND="modprobe.blacklist=emcp modprobe.blacklist=bnx2fc modprobe.blacklist=libfcoe modprobe.blacklist=fcoe"
  3. Restart the kdump service:

    # systemctl restart kdump

Additional resources

  • dracut.cmdline man page

15.11. Running kdump on systems with encrypted disk

When you run a LUKS encrypted partition, systems require certain amount of available memory. If the system has less than the required amount of available memory, the cryptsetup utility fails to mount the partition. As a result, capturing the vmcore file to an encrypted target location fails in the second kernel (capture kernel).

The kdumpctl estimate command helps you estimate the amount of memory you need for kdump. kdumpctl estimate prints the recommended crashkernel value, which is the most suitable memory size required for kdump.

The recommended crashkernel value is calculated based on the current kernel size, kernel module, initramfs, and the LUKS encrypted target memory requirement.

In case you are using the custom crashkernel= option, kdumpctl estimate prints the LUKS required size value. The value is the memory size required for LUKS encrypted target.

Procedure

  1. Print the estimate crashkernel= value:

    # *kdumpctl estimate*
    
    Encrypted kdump target requires extra memory, assuming using the keyslot with minimum memory requirement
       Reserved crashkernel:    256M
       Recommended crashkernel: 652M
    
       Kernel image size:   47M
       Kernel modules size: 8M
       Initramfs size:      20M
       Runtime reservation: 64M
       LUKS required size:  512M
       Large modules: <none>
       WARNING: Current crashkernel size is lower than recommended size 652M.
  2. Configure the amount of required memory by increasing the crashkernel= value.
  3. Reboot the system.
Note

If the kdump service still fails to save the dump file to the encrypted target, increase the crashkernel= value as required.