NFS kernel crash with RHEL 6.3
Hi there,
I have also opened a support case for this issue, but I am posting here as well. On an HP DL580 G7 server we got the crash below:
WARNING: active task ffff887fec4bc040 on cpu 50: corrupt cpu value: 3708346440
KERNEL: /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2013-02-08-18:11:42/vmcore [PARTIAL DUMP]
CPUS: 80
DATE: Fri Feb 8 18:07:53 2013
UPTIME: 2 days, 06:46:03
LOAD AVERAGE: 13.30, 12.83, 11.87
TASKS: 2473
NODENAME: ABC.XXX.net
RELEASE: 2.6.32-279.el6.x86_64
VERSION: #1 SMP Wed Jun 13 18:24:36 EDT 2012
MACHINE: x86_64 (2264 Mhz)
MEMORY: 512 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
PID: 78138
COMMAND: "nfsd"
TASK: ffff887fec4bc040 [THREAD_INFO: ffff887fdd08e000]
CPU: 50
STATE: TASK_UNINTERRUPTIBLE (PANIC)
crash> bt
PID: 78138 TASK: ffff887fec4bc040 CPU: 50 COMMAND: "nfsd"
#0 [ffff8820b0e87be0] machine_kexec at ffffffff8103281b
#1 [ffff8820b0e87c40] crash_kexec at ffffffff810ba662
#2 [ffff8820b0e87d10] oops_end at ffffffff81501290
#3 [ffff8820b0e87d40] die at ffffffff8100f26b
#4 [ffff8820b0e87d70] do_trap at ffffffff81500b84
#5 [ffff8820b0e87dd0] do_invalid_op at ffffffff8100ce35
#6 [ffff8820b0e87e70] invalid_op at ffffffff8100bedb
[exception RIP: do_nmi+554]
RIP: ffffffff8150105a RSP: ffff8820b0e87f28 RFLAGS: 00010002
RAX: ffff887fdd08ffd8 RBX: ffff8820b0e87f58 RCX: 00000000c0000101
RDX: 00000000ffff8820 RSI: ffffffffffffffff RDI: ffff8820b0e87f58
RBP: ffff8820b0e87f48 R8: 0000000000000000 R9: 0000000000000004
R10: ffffffff8163c9e0 R11: ffff881ff074c83f R12: 0000000000000e30
R13: ffff881ff0758b12 R14: ffff881ff0758c12 R15: ffff881ff0758c12
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff8820b0e87f50] nmi at ffffffff815008b0
[exception RIP: fbcon_redraw+129]
RIP: ffffffff812c01a1 RSP: ffff887fdd08da98 RFLAGS: 00000086
RAX: 0000000000000001 RBX: ffff881ff0758c12 RCX: 0000000000000008
RDX: 0000000000000007 RSI: ffff881ff0758c12 RDI: 0000000000000000
RBP: ffff887fdd08daf8 R8: 0000000000000000 R9: 0000000000000004
R10: ffffffff8163c9e0 R11: ffff881ff074c83f R12: 0000000000000e30
R13: ffff881ff0758b12 R14: ffff881ff0758c12 R15: ffff881ff0758c12
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#8 [ffff887fdd08da98] fbcon_redraw at ffffffff812c01a1
bt: cannot transition from exception stack to current process stack:
exception stack pointer: ffff8820b0e87be0
process stack pointer: ffff887fdd08db00
current stack base: ffff887fdd08e000
crash>
I think the problematic function is fbcon_redraw, but it is not related to NFS. How can I debug the issue further, given that I cannot send the vmcore to support for security reasons?
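Since the vmcore itself cannot leave the site, one option is to share only the text of the backtrace. Below is a minimal sketch, assuming the `bt` output above has been saved as plain text, that pulls out the frame list and the exception RIP lines so they can be pasted into a case or post. The names `parse_bt`, `FRAME_RE` and `EXC_RE` are hypothetical helpers made up for this illustration; they are not part of the crash utility.

```python
import re

# Frame lines look like: "#8 [ffff887fdd08da98] fbcon_redraw at ffffffff812c01a1"
FRAME_RE = re.compile(r"#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)")

# Exception markers look like: "[exception RIP: do_nmi+554]"
EXC_RE = re.compile(r"\[exception RIP:\s+([\w.]+)\+(\d+)\]")


def parse_bt(text):
    """Return (frames, exceptions) extracted from saved `bt` output.

    frames     -> list of (frame_number, function, return_address)
    exceptions -> list of (function, decimal_offset), in order of appearance
    """
    frames = [
        (int(num), func, addr)
        for num, _stack_ptr, func, addr in FRAME_RE.findall(text)
    ]
    exceptions = [(func, int(off)) for func, off in EXC_RE.findall(text)]
    return frames, exceptions


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): python parse_bt.py bt_output.txt
    with open(sys.argv[1]) as fh:
        frames, exceptions = parse_bt(fh.read())
    for num, func, addr in frames:
        print(f"#{num:<2} {func} ({addr})")
    for func, off in exceptions:
        print(f"exception RIP: {func}+{off}")
```

Run against the trace above, this would list frames #0 to #8 and the two exception RIPs (do_nmi+554 and fbcon_redraw+129), which already shows that the oops was raised inside the NMI handler while the nfsd task was in fbcon_redraw.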
Thanks,