How should the crashkernel parameter be configured for using kdump on RHEL6?


Environment

  • Red Hat Enterprise Linux (RHEL) 6
  • x86 and x86_64 architecture

Issue

  • What is the correct crashkernel parameter for kdump to work?
  • "crashkernel reservation failed - memory is in use" errors when kernel panics
  • When configuring the crashkernel parameter, the kdump service either fails to start or starts with this warning:
Your running kernel is using more than 70% of the amount of space you reserved for 
kdump, you should consider increasing your crashkernel reservation
  • Restarting the kdump service fails with the following error:
Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel

Resolution

The kdump procedure

The warning above means that the kdump operation might fail and that the crashkernel parameter needs to be configured correctly. The kdump procedure works as follows:

  1. The normal kernel is booted with crashkernel=... as a kernel option, reserving some memory for the kdump kernel. The memory reserved by the crashkernel parameter is not available to the normal kernel during regular operation.  It is reserved for later use by the kdump kernel.
  2. The system panics.
  3. The kdump kernel is booted using kexec. It uses the memory area that was reserved with the crashkernel parameter.
  4. The normal kernel's memory is captured into a vmcore.

Note: Not reserving enough memory for the kdump kernel can lead to the kdump operation failing.

Configuring crashkernel on RHEL6.0 and RHEL6.1 kernels

The code for printing the warning:

Your running kernel is using more than 70% of the amount of space you reserved for 
kdump, you should consider increasing your crashkernel reservation

is part of the script /etc/init.d/kdump.

The code involved:

  • First reads the Slab value from /proc/meminfo. Slab is the cache of in-kernel data structures. Its value depends on the total amount of RAM in the system as well as on other factors; it is not constant and can change while the server is running.
  • Prints the warning if the Slab value is larger than 70% of the memory that was reserved with the crashkernel parameter.

Some mappings of RAM size to appropriate crashkernel values:

RAM size    crashkernel parameter    RAM / crashkernel factor
>0GB        128MB                    15
>2GB        256MB                    23
>6GB        512MB                    15
>8GB        768MB                    31

The last column contains a ram/crashkernel factor.
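The warning check described above can be sketched in a few lines. This is only an illustration of the logic, not the code from /etc/init.d/kdump; the function names and the sample meminfo text are made up for the example, and the reserved size is passed in as a plain number of kB.

```python
import re

def slab_kb(meminfo_text):
    """Return the Slab value in kB from /proc/meminfo-style text."""
    match = re.search(r"^Slab:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Slab line found")
    return int(match.group(1))

def should_warn(meminfo_text, reserved_kb, percent=70):
    """True if Slab is larger than `percent` of the crashkernel reservation."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return slab_kb(meminfo_text) * 100 > percent * reserved_kb

# Hypothetical excerpt of /proc/meminfo, for illustration only.
sample = "MemTotal:  8060000 kB\nSlab:  100000 kB\nSReclaimable:  60000 kB\n"

for reserved_mb in (128, 256):
    print(reserved_mb, "MB reserved, warn:", should_warn(sample, reserved_mb * 1024))
```

Because Slab changes during operation, the same reservation can pass this check at one moment and trigger the warning at another.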

The table corresponds to the following crashkernel configuration:

crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M
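To see which reservation this range syntax selects for a given machine, the following sketch parses the string and looks up a RAM size. It assumes that each range includes its lower bound and excludes its upper bound, and that an empty upper bound means "no limit"; the helper names are hypothetical and this is not the kernel's own parser.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

def parse_size(text):
    """Convert a size such as '128M' or '2G' to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)

def reservation_for(ram_bytes, spec):
    """Return the reservation in bytes selected by a range-style spec, or 0."""
    for entry in spec.split(","):
        ram_range, size = entry.split(":")
        low, high = ram_range.split("-")
        start = parse_size(low)
        end = parse_size(high) if high else None  # open-ended range, e.g. '8G-'
        if ram_bytes >= start and (end is None or ram_bytes < end):
            return parse_size(size.split("@")[0])  # ignore an optional @offset
    return 0

SPEC = "0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M"

for ram_gb in (1, 4, 7, 16):
    chosen = reservation_for(ram_gb << 30, SPEC)
    print(ram_gb, "GB RAM ->", chosen >> 20, "MB reserved")
```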

For servers with more RAM, it is recommended to compute the crashkernel parameter from the factors observed so far: a factor of 15 stays on the safe side (possibly wasting some memory), and a factor of 20 should also work. Note also that the maximum amount of RAM that should be reserved here is 896M, as outlined in (private) bz580843.
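As a worked example of the factor rule, the sketch below divides the RAM size by the chosen factor and caps the result at 896MB. Rounding up to the next whole megabyte is an assumption made here to err on the safe side; the function name is hypothetical.

```python
MAX_RESERVATION_MB = 896  # upper limit stated above

def crashkernel_mb(ram_mb, factor=15):
    """Estimate a reservation as RAM / factor, rounded up and capped at 896MB."""
    estimate = (ram_mb + factor - 1) // factor  # integer ceiling division
    return min(estimate, MAX_RESERVATION_MB)

for ram_gb in (12, 16, 32):
    ram_mb = ram_gb * 1024
    print(ram_gb, "GB RAM:",
          "crashkernel=%dM (factor 15)," % crashkernel_mb(ram_mb, 15),
          "crashkernel=%dM (factor 20)" % crashkernel_mb(ram_mb, 20))
```

With a factor of 15 the cap is already reached at 16GB of RAM, so on larger machines the result is 896M regardless of the factor chosen.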

Configuring crashkernel on RHEL6.2 (and later) kernels

Starting with RHEL6.2 kernels crashkernel=auto should be used. The kernel will automatically reserve an appropriate amount of memory for the kdump kernel.

Keep in mind that this is an algorithmically calculated memory reservation and might not meet the needs of all systems, especially configurations with many I/O cards and loaded drivers. Always test kdump to make sure that the memory reserved by crashkernel=auto is sufficient for the target machine. If it is not, reserve more memory with the syntax crashkernel=XM, where X is the amount of memory to reserve in megabytes.

Additionally, some improvements in the RHEL6.2 kernel have reduced the overall memory requirements of kdump. For more details, refer to the article "kdump memory usage improvements included in Red Hat Enterprise Linux 6.2".

The amount of memory reserved for the kdump kernel can be estimated with the following scheme:

Base memory to be reserved = 128MB.
An additional 64MB is added for each TB of physical RAM present in the system.
For example, if a system has 1TB of memory, 192MB (128MB + 64MB) will be reserved.
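The estimation scheme can be written as a small function. The sketch assumes that only whole terabytes of RAM add to the base value, which the scheme above does not spell out; the function and constant names are hypothetical, and the real reservation should always be checked on the running system.

```python
BASE_MB = 128             # base reservation
PER_TB_MB = 64            # added per TB of physical RAM
MB_PER_TB = 1024 * 1024

def auto_estimate_mb(ram_mb):
    """Estimate the crashkernel=auto reservation in MB for a RAM size in MB."""
    whole_tb = ram_mb // MB_PER_TB  # assumption: partial terabytes are not counted
    return BASE_MB + PER_TB_MB * whole_tb

for label, ram_mb in (("512GB", 512 * 1024), ("1TB", MB_PER_TB), ("2TB", 2 * MB_PER_TB)):
    print(label, "->", auto_estimate_mb(ram_mb), "MB")
```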

Note: It is recommended to test and verify that kdump works on every system after all applications have been installed. The memory reserved by crashkernel=auto takes only typical RHEL configurations into account. Some hardware and larger configurations with many option cards may not work well with crashkernel=auto; in that case crashkernel=512M or more may be a reasonable starting size. If third-party modules are used, more memory might also have to be reserved. If a test dump fails, a good strategy is therefore to check whether it works with crashkernel=768M@0M and, if it does, to debug the memory requirements further using the debug_mem_level option in /etc/kdump.conf. kdump should not be considered properly configured until a test dump completes without failure.

Note: Prior to the 6.3GA release, crashkernel=auto reserves memory only on systems with 4GB or more of physical memory. If the system has less than 4GB of memory, the memory must be reserved by explicitly requesting the reservation size, for example: crashkernel=128M. Since the 6.3GA release (kernel-2.6.32-279.el6), this limit has been lowered to 2GB.
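The release-dependent limit can be checked from a kernel release string such as the output of uname -r. The sketch below is an illustration only: it assumes the build number after "2.6.32-" is enough to tell 6.3GA and later kernels (279 and above) from earlier ones, it does not check whether the kernel supports crashkernel=auto at all, and the release strings used are example values.

```python
import re

def auto_threshold_gb(kernel_release):
    """Minimum physical RAM in GB at which crashkernel=auto reserves memory."""
    match = re.match(r"2\.6\.32-(\d+)", kernel_release)
    if match is None:
        raise ValueError("not a RHEL6 2.6.32 kernel release string")
    build = int(match.group(1))
    return 2 if build >= 279 else 4  # 279 is the 6.3GA kernel named above

def auto_reserves(kernel_release, ram_gb):
    """True if crashkernel=auto is expected to reserve memory on this system."""
    return ram_gb >= auto_threshold_gb(kernel_release)

for release in ("2.6.32-220.el6.x86_64", "2.6.32-279.el6.x86_64"):
    print(release, "with 3GB RAM:", auto_reserves(release, 3))
```

When the function returns False, an explicit value such as crashkernel=128M is needed instead of crashkernel=auto.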

Note: Some environments still require manual configuration of the crashkernel option, for example when dumping to very large local filesystems. Please refer to "kdump fails with large ext4 file system because fsck.ext4 gets OOM-killed" for details.

Further information

Root Cause

A number of improvements related to crashkernel=auto and memory requirements of kdump have been made in the RHEL6.2 kernel.

Diagnostic Steps

  • The method used (pre-6.2) to calculate the approximate amount of RAM the normal kernel is using (from /etc/init.d/kdump):
KMEMINUSE=`awk '/Slab:.*/ {print $2}' /proc/meminfo`
  • Question: Is it possible to find out how much memory was reserved for the kdump kernel?
    Answer: Yes, run cat /proc/cmdline. Even when the kernel was started with crashkernel=auto, /proc/cmdline contains the computed value that was reserved. To verify that crashkernel=auto was really used, check the contents of /var/log/dmesg.
    • cat /proc/cmdline
    • cat /sys/kernel/kexec_crash_size
  • Question: I found out that 'sync; echo 3 > /proc/sys/vm/drop_caches' frees up Slab. Can I run this regularly and then use a lower value for 'crashkernel'?
    Answer: This is not recommended. The command drops filesystem caches; when processes request data afterwards, it has to be read again from disk/block devices, which degrades system performance.
  • Question: I set up kdump on my system. When kdump is triggered, the kdump kernel does not load completely.
    Answer: Are third-party drivers in use on the system that change the memory requirements? Does the system kdump successfully when crashkernel=768M@0M is used, or with a different manual allocation that is larger than the amount crashkernel=auto reserved for the crash kernel? If so, the debug_mem_level option in /etc/kdump.conf can be used to find out the required amount of memory, and the crashkernel reservation can then be reduced accordingly.
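For the question about how much memory was reserved, the two sources mentioned above can be read programmatically. The sketch below works on sample strings rather than the live files; it assumes that /sys/kernel/kexec_crash_size reports the size in bytes, and the function names and sample command line are hypothetical.

```python
import re

def crashkernel_option(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None

def reserved_mb(kexec_crash_size_text):
    """Convert the byte count from /sys/kernel/kexec_crash_size to MB."""
    return int(kexec_crash_size_text.strip()) // (1024 * 1024)

# Sample contents, for illustration only.
sample_cmdline = "ro root=/dev/mapper/vg-root rhgb quiet crashkernel=128M@16M"
sample_crash_size = "134217728\n"

print("crashkernel option:", crashkernel_option(sample_cmdline))
print("reserved:", reserved_mb(sample_crash_size), "MB")
```

On a real system the two strings would come from reading /proc/cmdline and /sys/kernel/kexec_crash_size.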
