GPFS (Spectrum Scale) 7.3 ppc64 LPARs are crashing
Issue
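- Frequent crashes on ppc64 LPAR systems running GPFS (Spectrum Scale). In both vmcores below, the crashing task is umount.nfs, which hits a Program Check exception in .shrink_dcache_for_umount_subtree.
Crash instance 1: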
crash> ps -p 6318
PID: 0 TASK: c000000001358180 CPU: 0 COMMAND: "swapper/0"
PID: 1 TASK: c000001ef9d00000 CPU: 31 COMMAND: "systemd"
PID: 41045 TASK: c000000e9e8d5300 CPU: 10 COMMAND: "runmmfs"
PID: 68294 TASK: c000000e9f766670 CPU: 12 COMMAND: "mmfsd" <
PID: 6296 TASK: c000000f4da56830 CPU: 66 COMMAND: "mmcommon"
PID: 6317 TASK: c000001ee9befac0 CPU: 41 COMMAND: "umount"
PID: 6318 TASK: c000001dc63779e0 CPU: 57 COMMAND: "umount.nfs" <
crash>
crash> bt 6318
PID: 6318 TASK: c000001dc63779e0 CPU: 57 COMMAND: "umount.nfs"
#0 [c000001dca6633a0] .crash_kexec at c000000000196284
#1 [c000001dca663420] .die at c000000000020cd8
#2 [c000001dca6634d0] ._exception at c000000000020ff4
#3 [c000001dca663670] .program_check_exception at c00000000097d748
#4 [c000001dca663700] program_check_common at c000000000006308
Program Check [700] exception frame:
R0: c000000000341868 R1: c000001dca6639f0 R2: c000000001429b00
R3: c00000051bc5b280 R4: f0000000002bf480 R5: c0000000c8f0e400
R6: c0000000c8f0e400 R7: 0000000000000001 R8: 0000000000000000
R9: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000
R12: c000000000346cc0 R13: c000000007b40100 R14: 0000000000000000
R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
R18: 0000000000000000 R19: 0000000000000000 R20: 0000000000000000
R21: c0000000012fdf88 R22: 0000000000000000 R23: c0000000017df938
R24: 0000000000000000 R25: c000001ef5ed9000 R26: 0000000000000000
R27: 00000100290a01e0 R28: d00000001dfae7c0 R29: 0000000000000043
R30: c000001ef5ed8800 R31: c00000051bc5b280
NIP: c00000000033cd90 MSR: 8000000002029032 OR3: c000000000341864
CTR: d00000001df75470 LR: c000000000341868 XER: 0000000000000000
CCR: 0000000084002822 MQ: 0000000000000001 DAR: 0000000000000000
DSISR: f0000000002bf480 Syscall Result: 0000000000000000
#5 [c000001dca6639f0] .shrink_dcache_for_umount_subtree at c00000000033cd90
[Link Register] [c000001dca6639f0] .shrink_dcache_for_umount at c000000000341868
#6 [c000001dca663aa0] .shrink_dcache_for_umount at c000000000341868 (unreliable)
#7 [c000001dca663b20] .kill_anon_super at c00000000031a1f8
#8 [c000001dca663bb0] fscache_n_op_requeue at d00000001df83bb0 [nfs]
#9 [c000001dca663c30] .deactivate_locked_super at c00000000031af30
#10 [c000001dca663cb0] .mntput_no_expire at c00000000034e340
#11 [c000001dca663d40] .sys_umount at c0000000003506d4
#12 [c000001dca663e30] system_call at c00000000000a17c
System Call [c00] exception frame:
R0: 0000000000000034 R1: 00003fffe9994bc0 R2: 00003fffae454700
Crash instance 2:
crash> ps -p 13272
PID: 0 TASK: c000000001358180 CPU: 0 COMMAND: "swapper/0"
PID: 1 TASK: c000001ef9980000 CPU: 62 COMMAND: "systemd"
PID: 60123 TASK: c0000017109226e0 CPU: 74 COMMAND: "runmmfs"
PID: 64877 TASK: c00000008aea6670 CPU: 21 COMMAND: "mmfsd" <
PID: 13167 TASK: c000000f6475e750 CPU: 4 COMMAND: "mmcommon"
PID: 13270 TASK: c000001eeccee590 CPU: 2 COMMAND: "umount"
PID: 13272 TASK: c000000ef12dfc80 CPU: 2 COMMAND: "umount.nfs" <
crash> bt
PID: 13272 TASK: c000000ef12dfc80 CPU: 2 COMMAND: "umount.nfs"
#0 [c000000eba60b3a0] .crash_kexec at c000000000196284
#1 [c000000eba60b420] .die at c000000000020cd8
#2 [c000000eba60b4d0] ._exception at c000000000020ff4
#3 [c000000eba60b670] .program_check_exception at c00000000097d748
#4 [c000000eba60b700] program_check_common at c000000000006308
Program Check [700] exception frame:
R0: c000000000341868 R1: c000000eba60b9f0 R2: c000000001429b00
R3: c000000fc9ac8940 R4: f00000000473c618 R5: c00000145a65c000
R6: c00000145a65c000 R7: 0000000000000001 R8: 0000000000000000
R9: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000
R12: c000000000346cc0 R13: c000000007b21200 R14: 0000000000000000
R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
R18: 0000000000000000 R19: 0000000000000000 R20: 0000000000000000
R21: c0000000012fdf88 R22: 0000000000000000 R23: c0000000017df938
R24: 0000000000000000 R25: c000000ed39a3800 R26: 0000000000000000
R27: 000001003c9a01e0 R28: d00000002df1e7c0 R29: 000000000000009e
R30: c000000ed39a2000 R31: c000000fc9ac8940
NIP: c00000000033cd90 MSR: 8000000000029032 OR3: c000000000341864
CTR: d00000002dee5470 LR: c000000000341868 XER: 0000000000000000
CCR: 0000000088002822 MQ: 0000000000000001 DAR: 0000000000000000
DSISR: f00000000473c618 Syscall Result: 0000000000000000
#5 [c000000eba60b9f0] .shrink_dcache_for_umount_subtree at c00000000033cd90
[Link Register] [c000000eba60b9f0] .shrink_dcache_for_umount at c000000000341868
#6 [c000000eba60baa0] .shrink_dcache_for_umount at c000000000341868 (unreliable)
#7 [c000000eba60bb20] .kill_anon_super at c00000000031a1f8
#8 [c000000eba60bbb0] fscache_n_op_requeue at d00000002def3bb0 [nfs]
#9 [c000000eba60bc30] .deactivate_locked_super at c00000000031af30
#10 [c000000eba60bcb0] .mntput_no_expire at c00000000034e340
#11 [c000000eba60bd40] .sys_umount at c0000000003506d4
#12 [c000000eba60be30] system_call at c00000000000a17c
System Call [c00] exception frame:
Environment
- Red Hat Enterprise Linux 7
- GPFS software (General Parallel File System) for clustered environments
  - mmfs26 and mmfslinux kernel modules (part of GPFS)
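
The two crash instances in the Issue section appear to share one call chain: both stop at NIP c00000000033cd90 in .shrink_dcache_for_umount_subtree, reached from .sys_umount. When comparing several vmcores, that check can be scripted. The following is a minimal sketch, assuming the `bt` output has been saved as text; the names `FRAME_RE`, `parse_frames` and `same_call_chain` are illustrative helpers and are not part of the crash utility. Only symbols are compared, because module addresses (for example the [nfs] frame) differ between the two dumps.

```python
import re

# Matches crash `bt` frame lines such as:
#   #5 [c000001dca6639f0] .shrink_dcache_for_umount_subtree at c00000000033cd90
# The trailing "[module]" tag is optional; notes like "(unreliable)" are ignored.
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)(?:\s+\[(\w+)\])?"
)


def parse_frames(bt_text):
    """Return a list of (frame_number, symbol, text_address, module) tuples."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            num, _stack_addr, symbol, addr, module = m.groups()
            # ppc64 function symbols carry a leading dot; drop it for comparison.
            frames.append((int(num), symbol.lstrip("."), addr, module))
    return frames


def same_call_chain(bt_a, bt_b):
    """True when both backtraces list the same symbols in the same order."""
    chain_a = [frame[1] for frame in parse_frames(bt_a)]
    chain_b = [frame[1] for frame in parse_frames(bt_b)]
    return chain_a == chain_b


# Frames #5 to #8 copied from the two crash instances in this article.
INSTANCE_1 = """\
#5 [c000001dca6639f0] .shrink_dcache_for_umount_subtree at c00000000033cd90
#6 [c000001dca663aa0] .shrink_dcache_for_umount at c000000000341868 (unreliable)
#7 [c000001dca663b20] .kill_anon_super at c00000000031a1f8
#8 [c000001dca663bb0] fscache_n_op_requeue at d00000001df83bb0 [nfs]
"""

INSTANCE_2 = """\
#5 [c000000eba60b9f0] .shrink_dcache_for_umount_subtree at c00000000033cd90
#6 [c000000eba60baa0] .shrink_dcache_for_umount at c000000000341868 (unreliable)
#7 [c000000eba60bb20] .kill_anon_super at c00000000031a1f8
#8 [c000000eba60bbb0] fscache_n_op_requeue at d00000002def3bb0 [nfs]
"""

if __name__ == "__main__":
    print(same_call_chain(INSTANCE_1, INSTANCE_2))
```

A matching call chain only suggests that the dumps hit the same code path; it does not by itself identify the root cause.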