Kdump not working if system configured with secure boot, many CPUs, and several PCI cards
Environment
- Red Hat Enterprise Linux 7 running a kernel older than the 7.3 errata kernel, kernel-3.10.0-514.6.1.el7
- Secure boot enabled
- 128 or more CPUs
- Five or more PCI cards
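As a rough aid for checking a system against the list above, the following Python sketch reads the secure boot state and CPU count and compares them with the stated thresholds. It assumes the standard efivarfs path for the `SecureBoot` variable; the helper names and the `pci_cards` parameter are hypothetical, and the card count has to be supplied by the administrator because sysfs lists PCI functions rather than physical cards.

```python
import os

# Standard efivarfs location of the UEFI SecureBoot variable (EFI global GUID).
SECURE_BOOT_VAR = (
    "/sys/firmware/efi/efivars/"
    "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def secure_boot_enabled(raw):
    """Interpret raw efivarfs contents: 4 attribute bytes, then 1 data byte."""
    return len(raw) >= 5 and raw[4] == 1


def read_secure_boot(path=SECURE_BOOT_VAR):
    """Return True if secure boot is reported as enabled, False otherwise."""
    try:
        with open(path, "rb") as handle:
            return secure_boot_enabled(handle.read())
    except OSError:
        # Legacy BIOS boot, or the variable is missing or unreadable.
        return False


def matches_environment(secure_boot, cpus, pci_cards):
    """Apply the thresholds from the Environment list (hypothetical helper)."""
    return bool(secure_boot) and cpus >= 128 and pci_cards >= 5


if __name__ == "__main__":
    cpus = os.cpu_count() or 0
    # pci_cards=5 is a placeholder; replace it with the real number of cards.
    print(matches_environment(read_secure_boot(), cpus, pci_cards=5))
```

The kernel version condition is not covered here; this sketch only looks at the hardware and firmware side of the environment.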
Issue
Why does kdump fail to capture core dumps on a system that has secure boot enabled, 128 or more CPUs, and five or more PCI cards installed?
Resolution
Red Hat Enterprise Linux 7.3 errata kernel 3.10.0-514.6.1.el7 and newer kernels contain a fix to allocate enough ELF header space for all memory ranges. This corrects an issue that was preventing kdump from working properly on systems whose configurations match the environment described above.
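To check whether a running kernel is at or above the fixed version, a comparison along the following lines may help. This is a minimal sketch, not a full RPM version comparison: it assumes the usual RHEL 7 release string format (for example `3.10.0-514.6.1.el7.x86_64`), and the function names are hypothetical.

```python
import platform
import re

# Kernel release named in this solution as containing the fix.
FIXED_KERNEL = "3.10.0-514.6.1.el7"


def release_key(release):
    """Convert '3.10.0-514.6.1.el7.x86_64' into (3, 10, 0, 514, 6, 1)."""
    # Drop the dist tag and architecture, keeping only the numeric part.
    numeric = release.split(".el")[0]
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def has_fix(release):
    """Return True if the release is the fixed kernel or newer."""
    return release_key(release) >= release_key(FIXED_KERNEL)


if __name__ == "__main__":
    running = platform.release()
    print(running, "contains the fix:", has_fix(running))
```

Because tuples compare element by element, the base 7.3 kernel `3.10.0-514.el7` sorts before `3.10.0-514.6.1.el7`, while a later release such as `3.10.0-693.el7` sorts after it.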
If an older update release of Red Hat Enterprise Linux must be used, other workarounds for this problem include reducing the number of CPUs, reducing the number of PCI cards, or not using secure boot on the system. Making any of these adjustments or updating the kernel to the listed version (or newer) should allow kdump to begin working again.
Root Cause
Some hardware configurations, such as the one described in the "Environment" section, can contain small memory regions that the kernel omits when allocating memory for kexec Executable and Linkable Format (ELF) files. As a result, not enough ELF header space is allocated for all memory ranges, and the operating system terminates unexpectedly when loading the kdump kernel.
