Kernel Panic in get_memory_error_data
Issue
o Repeated panics in the sb_edac module at the function get_memory_error_data
o Crashes are sporadic and may occur during boot or at any time afterward
crash> sys | grep -i -e rel -e panic
RELEASE: 2.6.32-573.38.1.el6.x86_64
PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000038"
crash> bt
PID: 7415 TASK: ffff88ff10e24040 CPU: 55 COMMAND: "edac-poller"
#0 [ffff88ff16c736f0] machine_kexec at ffffffff8103d1fb
#1 [ffff88ff16c73750] crash_kexec at ffffffff810cc632
#2 [ffff88ff16c73820] oops_end at ffffffff8153d970
#3 [ffff88ff16c73850] no_context at ffffffff8104e8cb
#4 [ffff88ff16c738a0] __bad_area_nosemaphore at ffffffff8104eb55
#5 [ffff88ff16c738f0] bad_area_nosemaphore at ffffffff8104ec23
#6 [ffff88ff16c73900] __do_page_fault at ffffffff8104f31c
#7 [ffff88ff16c73a20] do_page_fault at ffffffff8153f8be
#8 [ffff88ff16c73a50] page_fault at ffffffff8153cc05
[exception RIP: get_memory_error_data+0x248]
RIP: ffffffffa02979b8 RSP: ffff88ff16c73b00 RFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8aff1d6b1180
RDX: ffff8aff1d6b1180 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffff88ff16c73d00 R8: 0000000000000004 R9: ffff88ff16c73acc
R10: 000000000000007c R11: 0000000000000000 R12: ffffffffa0299ba0
R13: ffff88ff16c73bbc R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff88ff16c73d08] sbridge_check_error at ffffffffa029849c [sb_edac]
#10 [ffff88ff16c73e18] edac_mc_workq_function at ffffffffa03d5efa [edac_core]
#11 [ffff88ff16c73e38] worker_thread at ffffffff8109a910
#12 [ffff88ff16c73ee8] kthread at ffffffff810a115e
#13 [ffff88ff16c73f48] kernel_thread at ffffffff8100c28a
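When several vmcores have to be compared, it can help to pull the key facts out of the crash output mechanically. The following is a minimal sketch, not part of the original analysis: the function name summarize_oops and the SAMPLE text are hypothetical, and the regular expressions assume the crash utility output layout shown above. It extracts the fault address, the faulting symbol and offset, and the first module-tagged frame after the exception RIP.

```python
import re

# Sample lines copied from the crash session above (abbreviated).
SAMPLE = """\
PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000038"
#8 [ffff88ff16c73a50] page_fault at ffffffff8153cc05
[exception RIP: get_memory_error_data+0x248]
#9 [ffff88ff16c73d08] sbridge_check_error at ffffffffa029849c [sb_edac]
"""


def summarize_oops(text):
    """Hypothetical helper: summarize a NULL-dereference panic from crash output."""
    fault = re.search(r"NULL pointer dereference at ([0-9a-fA-F]+)", text)
    rip = re.search(r"\[exception RIP: (\w+)\+0x([0-9a-fA-F]+)\]", text)

    # The first frame after the exception RIP that carries a [module] tag
    # names the module the faulting call chain belongs to.
    module = None
    if rip:
        frame = re.search(
            r"^#\d+ \[[0-9a-f]+\] \w+ at [0-9a-f]+ \[(\w+)\]",
            text[rip.end():],
            re.MULTILINE,
        )
        module = frame.group(1) if frame else None

    return {
        "fault_address": int(fault.group(1), 16) if fault else None,
        "symbol": rip.group(1) if rip else None,
        "offset": int(rip.group(2), 16) if rip else None,
        "module": module,
    }


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

In this vmcore the fault address is 0x38 while RAX and RBX are both zero, which is consistent with a structure member being read through a NULL pointer, although the dump alone does not confirm which pointer was NULL.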
Environment
o RHEL 6.7
o kernel-2.6.32-573*
o Physical Machine
o Intel Broadwell EP or EX processor
o E7-8880 v4