NFS kernel crash with RHEL 6.3

Hi there,

I have also opened a support case for this issue, but I am posting here as well. On an HP DL580 G7 server we got the crash below:

WARNING: active task ffff887fec4bc040 on cpu 50: corrupt cpu value: 3708346440

     KERNEL: /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/vmlinux
   DUMPFILE: /var/crash/127.0.0.1-2013-02-08-18:11:42/vmcore  [PARTIAL DUMP]
       CPUS: 80
       DATE: Fri Feb  8 18:07:53 2013
     UPTIME: 2 days, 06:46:03
LOAD AVERAGE: 13.30, 12.83, 11.87
      TASKS: 2473
   NODENAME: ABC.XXX.net
    RELEASE: 2.6.32-279.el6.x86_64
    VERSION: #1 SMP Wed Jun 13 18:24:36 EDT 2012
    MACHINE: x86_64  (2264 Mhz)
     MEMORY: 512 GB
      PANIC: "Oops: 0000 [#1] SMP " (check log for details)
        PID: 78138
    COMMAND: "nfsd"
       TASK: ffff887fec4bc040  [THREAD_INFO: ffff887fdd08e000]
        CPU: 50
      STATE: TASK_UNINTERRUPTIBLE (PANIC)

crash> bt
PID: 78138  TASK: ffff887fec4bc040  CPU: 50  COMMAND: "nfsd"
#0 [ffff8820b0e87be0] machine_kexec at ffffffff8103281b
#1 [ffff8820b0e87c40] crash_kexec at ffffffff810ba662
#2 [ffff8820b0e87d10] oops_end at ffffffff81501290
#3 [ffff8820b0e87d40] die at ffffffff8100f26b
#4 [ffff8820b0e87d70] do_trap at ffffffff81500b84
#5 [ffff8820b0e87dd0] do_invalid_op at ffffffff8100ce35
#6 [ffff8820b0e87e70] invalid_op at ffffffff8100bedb
   [exception RIP: do_nmi+554]
   RIP: ffffffff8150105a  RSP: ffff8820b0e87f28  RFLAGS: 00010002
   RAX: ffff887fdd08ffd8  RBX: ffff8820b0e87f58  RCX: 00000000c0000101
   RDX: 00000000ffff8820  RSI: ffffffffffffffff  RDI: ffff8820b0e87f58
   RBP: ffff8820b0e87f48   R8: 0000000000000000   R9: 0000000000000004
   R10: ffffffff8163c9e0  R11: ffff881ff074c83f  R12: 0000000000000e30
   R13: ffff881ff0758b12  R14: ffff881ff0758c12  R15: ffff881ff0758c12
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#7 [ffff8820b0e87f50] nmi at ffffffff815008b0
   [exception RIP: fbcon_redraw+129]
   RIP: ffffffff812c01a1  RSP: ffff887fdd08da98  RFLAGS: 00000086
   RAX: 0000000000000001  RBX: ffff881ff0758c12  RCX: 0000000000000008
   RDX: 0000000000000007  RSI: ffff881ff0758c12  RDI: 0000000000000000
   RBP: ffff887fdd08daf8   R8: 0000000000000000   R9: 0000000000000004
   R10: ffffffff8163c9e0  R11: ffff881ff074c83f  R12: 0000000000000e30
   R13: ffff881ff0758b12  R14: ffff881ff0758c12  R15: ffff881ff0758c12
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#8 [ffff887fdd08da98] fbcon_redraw at ffffffff812c01a1
bt: cannot transition from exception stack to current process stack:
   exception stack pointer: ffff8820b0e87be0
     process stack pointer: ffff887fdd08db00
        current stack base: ffff887fdd08e000
crash>

I think the problematic function is fbcon_redraw, which is not related to NFS. How can I debug this further? I cannot send the vmcore to support for security reasons.
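
Since only text can leave the machine, here is a minimal sketch of how the frame list and exception RIPs could be pulled out of saved `bt` output, so that a short summary can be shared instead of the vmcore. It is not part of the crash utility. The names `summarize_bt` and `SAMPLE` are made up for illustration, and the regular expressions assume the frame layout shown in the output above.

```python
import re

# Frame lines in crash "bt" output, for example:
#   #8 [ffff887fdd08da98] fbcon_redraw at ffffffff812c01a1
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)"
)

# Exception markers, for example:
#   [exception RIP: do_nmi+554]
EXC_RE = re.compile(r"\[exception RIP:\s*([^\]\s]+)\]")


def summarize_bt(text):
    """Return (frames, exception_rips) parsed from pasted bt output.

    frames is a list of (frame_number, symbol, address) tuples.
    exception_rips is a list of "symbol+offset" strings in order of appearance.
    """
    frames = []
    exception_rips = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m.group(1)), m.group(3), m.group(4)))
            continue
        m = EXC_RE.search(line)
        if m:
            exception_rips.append(m.group(1))
    return frames, exception_rips


# Hypothetical sample: a few lines copied from the backtrace above.
SAMPLE = """\
#6 [ffff8820b0e87e70] invalid_op at ffffffff8100bedb
   [exception RIP: do_nmi+554]
#7 [ffff8820b0e87f50] nmi at ffffffff815008b0
   [exception RIP: fbcon_redraw+129]
#8 [ffff887fdd08da98] fbcon_redraw at ffffffff812c01a1
"""

frames, rips = summarize_bt(SAMPLE)
for number, symbol, address in frames:
    print(f"#{number} {symbol} at {address}")
print("exception RIPs:", ", ".join(rips))
```

Run against the full output, this would list both exception points (do_nmi+554 and fbcon_redraw+129) next to the frames, which is the part I would like help interpreting.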

Thanks,

Responses