Kernel panic occurs in get_memory_error_data
Issue
- The system panics repeatedly in the get_memory_error_data function of the sb_edac module.
- The crashes occur intermittently, either during system boot or at some point after boot.
crash> sys | grep -i -e rel -e panic
RELEASE:2.6.32-573.38.1.el6.x86_64
PANIC:"BUG: unable to handle kernel NULL pointer dereference at 0000000000000038"
crash> bt
PID:7415 TASK: ffff88ff10e24040 CPU:55 COMMAND:"edac-poller"
#0 [ffff88ff16c736f0] machine_kexec at ffffffff8103d1fb
#1 [ffff88ff16c73750] crash_kexec at ffffffff810cc632
#2 [ffff88ff16c73820] oops_end at ffffffff8153d970
#3 [ffff88ff16c73850] no_context at ffffffff8104e8cb
#4 [ffff88ff16c738a0] __bad_area_nosemaphore at ffffffff8104eb55
#5 [ffff88ff16c738f0] bad_area_nosemaphore at ffffffff8104ec23
#6 [ffff88ff16c73900] __do_page_fault at ffffffff8104f31c
#7 [ffff88ff16c73a20] do_page_fault at ffffffff8153f8be
#8 [ffff88ff16c73a50] page_fault at ffffffff8153cc05
[exception RIP: get_memory_error_data+0x248]
RIP: ffffffffa02979b8 RSP: ffff88ff16c73b00 RFLAGS:00010246
RAX:0000000000000000 RBX:0000000000000000 RCX: ffff8aff1d6b1180
RDX: ffff8aff1d6b1180 RSI:0000000000000246 RDI:0000000000000246
RBP: ffff88ff16c73d00 R8:0000000000000004 R9: ffff88ff16c73acc
R10:000000000000007c R11:0000000000000000 R12: ffffffffa0299ba0
R13: ffff88ff16c73bbc R14:0000000000000000 R15:0000000000000000
ORIG_RAX: ffffffffffffffff CS:0010 SS:0018
#9 [ffff88ff16c73d08] sbridge_check_error at ffffffffa029849c [sb_edac]
#10 [ffff88ff16c73e18] edac_mc_workq_function at ffffffffa03d5efa [edac_core]
#11 [ffff88ff16c73e38] worker_thread at ffffffff8109a910
#12 [ffff88ff16c73ee8] kthread at ffffffff810a115e
#13 [ffff88ff16c73f48] kernel_thread at ffffffff8100c28a
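The PANIC line reports a fault at address 0x38 rather than at 0. A small nonzero fault address like this usually means that a structure member at that offset was read through a NULL base pointer, which is consistent with RAX and RBX both being zero in the register dump. The following is a minimal Python sketch of that reading of the panic string. The helper names and the one-page threshold are assumptions made for illustration; they are not part of the crash utility or the kernel.

```python
import re

# Panic string copied from the "sys" output above.
PANIC = "BUG: unable to handle kernel NULL pointer dereference at 0000000000000038"


def fault_address(panic_line):
    """Return the faulting address from a panic message as an int, or None."""
    match = re.search(r"dereference at ([0-9a-fA-F]+)", panic_line)
    if match is None:
        return None
    return int(match.group(1), 16)


def looks_like_null_member_access(address, page_size=4096):
    """Heuristic (an assumption): an address inside the first page is
    most likely NULL plus a structure member offset."""
    return address is not None and 0 <= address < page_size


addr = fault_address(PANIC)
print(hex(addr), looks_like_null_member_access(addr))
```

The offset alone does not identify which structure or pointer was NULL; that requires disassembling get_memory_error_data+0x248 in the vmcore.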
Environment
- RHEL 6.7
- Kernel-2.6.32-573*
- Physical machine
- Intel Broadwell EP or EX processors
- E7-8880 v4
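To check whether a host falls within the environment listed above, the kernel release can be compared against the 2.6.32-573* pattern and the loaded module list can be searched for sb_edac. The sketch below assumes the kernel release string comes from `uname -r` and the module list from the contents of /proc/modules. The function names are hypothetical, and the check does not cover the processor model.

```python
from fnmatch import fnmatchcase

# Pattern taken from the Environment list above.
AFFECTED_KERNEL_PATTERN = "2.6.32-573*"


def kernel_matches(release, pattern=AFFECTED_KERNEL_PATTERN):
    """True if a kernel release string (as from `uname -r`) matches the pattern."""
    return fnmatchcase(release, pattern)


def module_loaded(proc_modules_text, name="sb_edac"):
    """True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False
```

For example, `kernel_matches("2.6.32-573.38.1.el6.x86_64")` is true for the RELEASE value shown in the crash output, and `module_loaded(open("/proc/modules").read())` reports whether sb_edac is currently loaded on a Linux host.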