RHEL6: corrupted ELF header in /proc/vmcore causes kdump to fail
Issue
On one of our servers, kdump frequently fails with the following message:
[Tue Jul 21 07:23:19 2015]Saving vmcore-dmesg.txt
[Tue Jul 21 07:23:19 2015]Cannot malloc 5999218776 bytes
[Tue Jul 21 07:23:19 2015]Saving vmcore-dmesg.txt failed
The problem is caused by a corrupted ELF header in /proc/vmcore that describes a note section of roughly 6 GB (5999218776 bytes, the size of the failed malloc). When kdump succeeds, the note section is 18620 bytes.
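As a rough illustration of the size comparison described above, the following bash sketch pulls the NOTE segment size out of `readelf -lW` output and flags it when it is implausibly large. The helper names note_filesz and check_note_size and the 1 MiB threshold are assumptions made for this example, not part of kdump or makedumpfile. It also assumes that readelf (binutils) is available and that the vmcore has been copied somewhere it can be inspected.

```shell
#!/bin/bash
# Print the file size, in bytes, of the first NOTE program header found in
# `readelf -lW` output read from stdin.
# In wide mode a NOTE row has the columns:
#   Type Offset VirtAddr PhysAddr FileSiz MemSiz [Flg] Align
note_filesz() {
    local hex
    hex=$(awk '$1 == "NOTE" { print $5; exit }')
    if [ -n "$hex" ]; then
        # bash printf accepts 0x-prefixed hexadecimal for %d
        printf '%d\n' "$hex"
    fi
}

# Hypothetical sanity limit. A healthy note section on this system was
# 18620 bytes, so 1 MiB leaves a wide margin. This value is an assumption.
MAX_NOTE_BYTES=$(( 1024 * 1024 ))

# Report whether a note size in bytes looks plausible.
check_note_size() {
    local size=$1
    if [ "$size" -gt "$MAX_NOTE_BYTES" ]; then
        echo "suspicious"
    else
        echo "ok"
    fi
}

# Example usage (issue_vmcore is a saved copy of /proc/vmcore):
#   size=$(readelf -lW issue_vmcore | note_filesz)
#   check_note_size "$size"
```

With the sizes reported in this article, 18620 would be reported as ok and 5999218776 as suspicious.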
The issue occurs in 3 out of 4 attempts to create a kdump. How the crash is triggered does not matter; for example,
# echo c > /proc/sysrq-trigger
can be used.
When the issue occurs, "vmcore-incomplete" is written to the kdump directory under /var/crash/ and the error shown above is printed to the console.
When /proc/vmcore is inspected in the second (kdump) kernel, the size of the note0 section is abnormally large:
# objdump --section-headers issue_vmcore > issue_vmcore_objdump.log
# head issue_vmcore_objdump.log
issue_vmcore:     file format elf64-x86-64
Sections:
Idx Name          Size       VMA               LMA               File off  Algn
  0 note0         16594d058  0000000000000000  0000000000000000  00000238  2**0
                  CONTENTS, READONLY
  1 .reg/0        000000d8   0000000000000000  0000000000000000  000002bc  2**2
                  CONTENTS
  2 .reg          000000d8   0000000000000000  0000000000000000  000002bc  2**2
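The note0 size reported by objdump is hexadecimal, while the console message gives the failed allocation in decimal. A small bash sketch (the variable names are chosen here for illustration) converts one to the other to confirm they are the same number:

```shell
# Size column of the note0 row in the objdump output (hexadecimal).
note0_hex=16594d058
# Byte count from the "Cannot malloc" console message.
malloc_bytes=5999218776

# bash arithmetic accepts base#digits, so 16#... parses the value as hex.
note0_bytes=$(( 16#$note0_hex ))
echo "note0 size: $note0_bytes bytes"

if [ "$note0_bytes" -eq "$malloc_bytes" ]; then
    echo "note0 size matches the failed malloc request"
fi
```

The two values are equal, which suggests that the vmcore-dmesg step tried to allocate a buffer for the whole (corrupted) note section.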
Environment
- Red Hat Enterprise Linux (RHEL) 6, minor releases earlier than 6.8
- kexec/kdump
- x86_64 architecture