After updating to RHEL 7.5 or later, the kernel panics due to an invalid memory access triggered by the "KsFree()" function of the "oracleoks" kernel module.
Environment
- kernel-3.10.0-862.el7 or later
- Red Hat Enterprise Linux 7.5
- oracleoks third-party kernel module
Issue
- After updating to RHEL 7.5 or later, the kernel panics due to an invalid memory access triggered by the "KsFree()" function of the "oracleoks" kernel module.
crash> bt
PID: 33293 TASK: ffff982bc5e3af70 CPU: 4 COMMAND: "modprobe"
#0 [ffff982caff9fad8] machine_kexec at ffffffffbde629da
#1 [ffff982caff9fb38] __crash_kexec at ffffffffbdf16692
#2 [ffff982caff9fc08] crash_kexec at ffffffffbdf16780
#3 [ffff982caff9fc20] oops_end at ffffffffbe51d728
#4 [ffff982caff9fc48] no_context at ffffffffbe50c6cd
#5 [ffff982caff9fc98] __bad_area_nosemaphore at ffffffffbe50c764
#6 [ffff982caff9fce8] bad_area_nosemaphore at ffffffffbe50c8d5
#7 [ffff982caff9fcf8] __do_page_fault at ffffffffbe5206e0
#8 [ffff982caff9fd60] do_page_fault at ffffffffbe5208d5
#9 [ffff982caff9fd90] page_fault at ffffffffbe51c758
[exception RIP: kfree+85]
RIP: ffffffffbdffa5f5 RSP: ffff982caff9fe40 RFLAGS: 00010282
RAX: ffffdbf1b0600040 RBX: ffffb2f758001000 RCX: 0000000000000004
RDX: 000067e440000000 RSI: 0000000003d09000 RDI: ffffb2f758001000
RBP: ffff982caff9fe58 R8: 000000000001bb20 R9: ffffffffc081e9dd
R10: ffff983afe71bb20 R11: ffffdb868549fe00 R12: 0000000003d09000
R13: ffffffffc081e9dd R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff982caff9fe60] KsFree at ffffffffc081e9dd [oracleoks]
#11 [ffff982caff9fe80] odlm_rsb_free_tbl at ffffffffc082bb4f [oracleoks]
#12 [ffff982caff9fe90] odlm_subsys_unconfigure at ffffffffc082b52b [oracleoks]
#13 [ffff982caff9feb8] cleanup_module at ffffffffc08538ae [oracleoks]
#14 [ffff982caff9fec8] sys_delete_module at ffffffffbdf0fe8e
#15 [ffff982caff9ff50] system_call_fastpath at ffffffffbe52579b
RIP: 00007fc51c8f5027 RSP: 00007ffd37575f08 RFLAGS: 00000246
RAX: 00000000000000b0 RBX: 00000000025ca6b0 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000025ca718
RBP: 0000000000000000 R8: 00007fc51cbbd060 R9: 00007fc51c9691a0
R10: 0000000000000000 R11: 0000000000000206 R12: 00000000025c8210
R13: 0000000000000000 R14: 00000000025c8508 R15: 0000000000000000
ORIG_RAX: 00000000000000b0 CS: 0033 SS: 002b
Resolution
- Contact Oracle Support for further investigation, as the panic occurs inside the oracleoks driver. Access to the oracleoks source code will be required to troubleshoot further.
- If Oracle Support has feedback or questions for Red Hat, pass this information into the Red Hat case.
- If required, Red Hat and Oracle can collaborate on this issue.
Root Cause
- The vmcore indicates an invalid kernel paging request (at address ffffdbf1b0600040, per the panic message) as the root cause of the panic.
- This occurs in the kernel function kfree(). kfree() expects a single argument, a pointer to an object allocated with kmalloc(). However, the argument passed in this case does not point to any object in a slab cache. kfree() is called by the "oracleoks" function KsFree().
- The argument instead marks the start of a vm_struct address range. This address range was allocated by the "oracleoks" function KsMalloc().
Diagnostic Steps
- Vmcore findings.
CPUS: 6
DATE: Fri Sep 28 16:04:49 2018
UPTIME: 00:54:41
LOAD AVERAGE: 1.48, 1.43, 1.29
TASKS: 362
RELEASE: 3.10.0-862.11.6.el7.x86_64
VERSION: #1 SMP Fri Aug 10 16:55:11 UTC 2018
MACHINE: x86_64 (3099 Mhz)
MEMORY: 128 GB
PANIC: "BUG: unable to handle kernel paging request at ffffdbf1b0600040"
crash> bt
PID: 33293 TASK: ffff982bc5e3af70 CPU: 4 COMMAND: "modprobe"
#0 [ffff982caff9fad8] machine_kexec at ffffffffbde629da
#1 [ffff982caff9fb38] __crash_kexec at ffffffffbdf16692
#2 [ffff982caff9fc08] crash_kexec at ffffffffbdf16780
#3 [ffff982caff9fc20] oops_end at ffffffffbe51d728
#4 [ffff982caff9fc48] no_context at ffffffffbe50c6cd
#5 [ffff982caff9fc98] __bad_area_nosemaphore at ffffffffbe50c764
#6 [ffff982caff9fce8] bad_area_nosemaphore at ffffffffbe50c8d5
#7 [ffff982caff9fcf8] __do_page_fault at ffffffffbe5206e0
#8 [ffff982caff9fd60] do_page_fault at ffffffffbe5208d5
#9 [ffff982caff9fd90] page_fault at ffffffffbe51c758
[exception RIP: kfree+85]
RIP: ffffffffbdffa5f5 RSP: ffff982caff9fe40 RFLAGS: 00010282
RAX: ffffdbf1b0600040 RBX: ffffb2f758001000 RCX: 0000000000000004
RDX: 000067e440000000 RSI: 0000000003d09000 RDI: ffffb2f758001000
RBP: ffff982caff9fe58 R8: 000000000001bb20 R9: ffffffffc081e9dd
R10: ffff983afe71bb20 R11: ffffdb868549fe00 R12: 0000000003d09000
R13: ffffffffc081e9dd R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff982caff9fe60] KsFree at ffffffffc081e9dd [oracleoks]
#11 [ffff982caff9fe80] odlm_rsb_free_tbl at ffffffffc082bb4f [oracleoks]
#12 [ffff982caff9fe90] odlm_subsys_unconfigure at ffffffffc082b52b [oracleoks]
#13 [ffff982caff9feb8] cleanup_module at ffffffffc08538ae [oracleoks]
#14 [ffff982caff9fec8] sys_delete_module at ffffffffbdf0fe8e
#15 [ffff982caff9ff50] system_call_fastpath at ffffffffbe52579b
RIP: 00007fc51c8f5027 RSP: 00007ffd37575f08 RFLAGS: 00000246
RAX: 00000000000000b0 RBX: 00000000025ca6b0 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000025ca718
RBP: 0000000000000000 R8: 00007fc51cbbd060 R9: 00007fc51c9691a0
R10: 0000000000000000 R11: 0000000000000206 R12: 00000000025c8210
R13: 0000000000000000 R14: 00000000025c8508 R15: 0000000000000000
ORIG_RAX: 00000000000000b0 CS: 0033 SS: 002b
crash> mod -t
NAME TAINTS
oracleoks POE
crash> kmem ffffdbf1b0600040
kmem: WARNING: cannot make virtual-to-physical translation: ffffdbf1b0600040
ffffdbf1b0600040: kernel virtual address not found in mem map
- There's only one variable here: the object pointer passed into kfree() (ffffb2f758001000, held in RDI and RBX).
crash> kmem ffffb2f758001000
VMAP_AREA VM_STRUCT ADDRESS RANGE SIZE
ffff982cdbe42500 ffff982cdfbb4380 ffffb2f758001000 - ffffb2f75bd0b000 64004096
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffdb86863cf400 118f3d0000 0 0 1 2fffff00000000
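The faulting address can be reproduced from the kfree() argument with a little arithmetic, which helps confirm that the bad pointer itself led to the invalid paging request. The sketch below is illustrative only and rests on assumptions that the vmcore output does not state: a 4 KiB page size, a 64-byte struct page, the x86_64 __START_KERNEL_map value 0xffffffff80000000, and the guess that RDX held __START_KERNEL_map minus the direct-map base when the fault occurred. The vmemmap base is inferred from the PAGE/PHYSICAL pair printed by kmem above. All function names are hypothetical.

```python
# Values copied from the crash output in this article.
KFREE_ARG = 0xffffb2f758001000   # RDI/RBX at kfree+85
FAULT_ADDR = 0xffffdbf1b0600040  # RAX and the address in the PANIC string
RDX = 0x000067e440000000         # RDX at kfree+85
PAGE_ADDR = 0xffffdb86863cf400   # PAGE column from "kmem ffffb2f758001000"
PAGE_PHYS = 0x118f3d0000         # PHYSICAL column from the same output

# Assumptions (not shown in the vmcore output).
PAGE_SHIFT = 12                          # 4 KiB pages
SIZEOF_STRUCT_PAGE = 64                  # typical x86_64 struct page size
START_KERNEL_MAP = 0xffffffff80000000    # x86_64 __START_KERNEL_map


def infer_page_offset(rdx):
    """Guess the direct-map base, assuming RDX = __START_KERNEL_map - base."""
    return START_KERNEL_MAP - rdx


def infer_vmemmap_base(page_addr, phys_addr):
    """Derive the vmemmap base from one known (struct page, physical) pair."""
    pfn = phys_addr >> PAGE_SHIFT
    return page_addr - pfn * SIZEOF_STRUCT_PAGE


def struct_page_for(vaddr, page_offset, vmemmap_base):
    """Treat vaddr as a direct-map address and locate its struct page."""
    pfn = (vaddr - page_offset) >> PAGE_SHIFT
    return vmemmap_base + pfn * SIZEOF_STRUCT_PAGE


if __name__ == "__main__":
    page_offset = infer_page_offset(RDX)
    vmemmap_base = infer_vmemmap_base(PAGE_ADDR, PAGE_PHYS)
    computed = struct_page_for(KFREE_ARG, page_offset, vmemmap_base)
    print(hex(page_offset))    # 0xffff981b40000000
    print(hex(vmemmap_base))   # 0xffffdb8640000000
    print(hex(computed))       # 0xffffdbf1b0600040
    print(computed == FAULT_ADDR)  # True
```

With the values from this vmcore, the computed struct page address equals ffffdbf1b0600040, the address in the PANIC string. That is consistent with kfree() treating a vmalloc address as if it lay in the direct map and then dereferencing a struct page that does not exist, although the exact match supports the assumptions above without proving them.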
- This address range was originally allocated via KsMalloc -> vmalloc
crash> vm_struct ffff982cdfbb4380
struct vm_struct {
next = 0x0,
addr = 0xffffb2f758001000,
size = 64004096,
flags = 18,
pages = 0xffffb2f74c655000,
nr_pages = 15625,
phys_addr = 0,
caller = 0xffffffffc081caf8
}
crash> sym 0xffffffffc081caf8
ffffffffc081caf8 (w) KsMalloc+392 [oracleoks]
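The vm_struct fields above can be cross-checked against each other to confirm that this is an ordinary vmalloc area. The following sketch is a rough aid, not part of the original analysis. The flag bit values are an assumption taken from the upstream 3.10 include/linux/vmalloc.h as best recalled and should be verified against the RHEL kernel source; the 4 KiB page size is also assumed, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Assumed flag bits (upstream 3.10 include/linux/vmalloc.h; verify for RHEL).
VM_FLAG_NAMES = {
    0x01: "VM_IOREMAP",
    0x02: "VM_ALLOC",
    0x04: "VM_MAP",
    0x08: "VM_USERMAP",
    0x10: "VM_VPAGES",
    0x20: "VM_UNLIST",
}


def decode_vm_flags(flags):
    """Return the names of the flag bits set in vm_struct.flags."""
    return [name for bit, name in sorted(VM_FLAG_NAMES.items()) if flags & bit]


def guard_bytes(size, nr_pages):
    """Bytes in the area that are not backed by one of the nr_pages pages."""
    return size - nr_pages * PAGE_SIZE


def range_end(addr, size):
    """End of the address range covered by the vm_struct."""
    return addr + size


if __name__ == "__main__":
    # Field values copied from the vm_struct output in this article.
    print(decode_vm_flags(18))                        # ['VM_ALLOC', 'VM_VPAGES']
    print(guard_bytes(64004096, 15625))               # 4096
    print(hex(range_end(0xffffb2f758001000, 64004096)))  # 0xffffb2f75bd0b000
```

Under those assumptions, flags = 18 decodes to VM_ALLOC plus VM_VPAGES, which is what a large vmalloc() allocation would be expected to carry. The size exceeds nr_pages * 4096 by exactly one page, matching the guard page vmalloc adds, and addr + size equals the end of the ADDRESS RANGE reported by kmem. Memory obtained this way needs to be released with vfree() rather than kfree().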