[RHEL 8][GPFS] Crash in filp_close() when dereferencing a NULL filp->f_op pointer

Solution Verified - Updated -

Issue

The system crashes with the following console messages:

[120692.713335] Faulting instruction address: 0xc0000000005895a8
[120692.713340] Oops: Kernel access of bad area, sig: 7 [#1]
[120692.713345] LE SMP NR_CPUS=2048 NUMA PowerNV
[120692.713353] Modules linked in: nf_tables libcrc32c nfnetlink mmfs26(OE) mmfslinux(OE) tracedev(OE) mptcp_diag xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag 8021q garp stp mrp llc bonding uio_pci_generic vfio_pci vfio_virqfd vfio_iommu_spapr_tce vfio vfio_spapr_eeh cuse fuse rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) i2c_dev ib_ipoib(OE) ib_cm(OE) ib_umad(OE) ses enclosure scsi_transport_sas xts ipmi_powernv vmx_crypto ipmi_devintf ipmi_msghandler uio_pdrv_genirq uio leds_powernv powernv_op_panel ibmpowernv auth_rpcgss sunrpc knem(OE) binfmt_misc ext4 mbcache jbd2 mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) sd_mod mlx5_core(OE) mlxdevm(OE) ipr mlx_compat(OE) libata psample mlxfw(OE) tg3 tls dm_mirror dm_region_hash dm_log dm_mod opal_prd nvme nvme_core t10_pi sg
[120692.713450] CPU: 116 PID: 689024 Comm: smbd[<IP_address> Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.22.1.el8_10.ppc64le #1
[120692.713457] NIP:  c0000000005895a8 LR: c0000000005d064c CTR: 0000000000000000
[120692.713462] REGS: c00000131b5e3980 TRAP: 0300   Tainted: G           OE     -------- -  -  (4.18.0-553.22.1.el8_10.ppc64le)
[120692.713469] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24242204  XER: 20040000
[120692.713481] CFAR: c0000000005d0648 DAR: 0000000000000078 DSISR: 00080000 IRQMASK: 0 
                GPR00: c0000000005d064c c00000131b5e3c10 c00000000220f300 c000000f92aab500 
                GPR04: c000000180936f00 0000000000000001 c00000131b5e3be8 0000000000042088 
                GPR08: 0000000000002000 0000000000000000 0000000000000120 0000201ffac50000 
                GPR12: 0000000024242202 c000201fffe26800 0000000055823118 0000000000000000 
                GPR16: 0000000000000000 0000000000000000 0000000000000010 00000000bb9fede5 
                GPR20: 0000000000000002 00007fffb7665070 0000000000000003 0000000000000001 
                GPR24: 00007fffb76608a8 0000000000000000 c000000180936f00 0000000000000005 
                GPR28: 0000000000000025 0000000000000000 c000000180936f00 c000000f92aab500 
[120692.713534] NIP [c0000000005895a8] filp_close+0x48/0xe0
[120692.713544] LR [c0000000005d064c] put_files_struct.part.4+0xdc/0x190
[120692.713551] Call Trace:
[120692.713553] [c00000131b5e3c10] [c00000131b5e3c60] 0xc00000131b5e3c60 (unreliable)
[120692.713559] [c00000131b5e3c90] [c0000000005d064c] put_files_struct.part.4+0xdc/0x190
[120692.713565] [c00000131b5e3cf0] [c00000000017a8a8] do_exit+0x498/0xd60
[120692.713571] [c00000131b5e3dc0] [c00000000017b240] do_group_exit+0x60/0x110
[120692.713577] [c00000131b5e3e00] [c00000000017b314] sys_exit_group+0x24/0x30
[120692.713582] [c00000131b5e3e20] [c00000000000b408] system_call+0x5c/0x70
[120692.713589] Instruction dump:
[120692.713593] f8010010 f821ff81 e9230038 2fa90000 419e0098 fbc10070 fbe10078 f8410018 
[120692.713603] 7c7f1b78 7c9e2378 3ba00000 e9230028 <e9290078> 2fa90000 419e0018 7d2c4b78 
[120692.713616] ---[ end trace e349243b1cfd4f50 ]---
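The faulting word in the instruction dump above (the one in angle brackets) and the word just before it can be decoded by hand to see why DAR is 0x78. The following is a minimal sketch, assuming the standard Power ISA DS-form layout for primary opcode 58 (ld/ldu/lwa). The helper name decode_ds_form is illustrative and is not part of any tool. The register values are copied from the oops output. Which structure member lives at offset 0x78 was not checked against this kernel's struct layout; it is presumably the member that filp_close() reads through f_op.

```python
# Illustrative decoder for 64-bit PowerPC DS-form loads (primary opcode 58).
# Field layout: opcode[0:5] RT[6:10] RA[11:15] DS[16:29] XO[30:31] (big-endian bit numbering).

DS_FORM_MNEMONICS = {0: "ld", 1: "ldu", 2: "lwa"}


def decode_ds_form(word):
    """Return (mnemonic, rt, ra, displacement) for a DS-form load word."""
    opcode = (word >> 26) & 0x3F
    if opcode != 58:
        raise ValueError(f"not a DS-form load: primary opcode {opcode}")
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    xo = word & 0x3
    if xo not in DS_FORM_MNEMONICS:
        raise ValueError(f"reserved extended opcode {xo}")
    # The displacement is the low 16 bits with the two XO bits cleared, sign-extended.
    disp = word & 0xFFFC
    if disp & 0x8000:
        disp -= 0x10000
    return DS_FORM_MNEMONICS[xo], rt, ra, disp


# Words taken from the "Instruction dump" lines of the oops.
previous = decode_ds_form(0xE9230028)   # instruction just before the fault
faulting = decode_ds_form(0xE9290078)   # instruction marked <e9290078>

# GPR values at the time of the fault, taken from the register dump.
gpr = {3: 0xC000000F92AAB500, 9: 0x0000000000000000}

mnemonic, rt, ra, disp = faulting
# Note: for ld, RA=0 would mean a literal zero base; here RA is r9, so the register value is used.
effective_address = (gpr[ra] + disp) & 0xFFFFFFFFFFFFFFFF

for name, (m, t, a, d) in (("previous", previous), ("faulting", faulting)):
    print(f"{name}: {m} r{t},{d:#x}(r{a})")
print(f"effective address: {effective_address:#x}")
```

The previous instruction loads r9 from offset 0x28 of r3 (the struct file pointer passed to filp_close()), and the faulting instruction then loads from offset 0x78 of r9. Because r9 is zero, the computed effective address is 0x78, which matches the DAR value in the oops and is consistent with a NULL filp->f_op.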

The backtrace of the crashing task, as shown by the crash utility:

crash> bt
PID: 689024   TASK: c000001315b89000  CPU: 116  COMMAND: "smbd[<IP_address>"
 R0:  c0000000000e3a08    R1:  c00000010e8fbc60    R2:  c00000000220f300   
 R3:  c00000010e8fbae8    R4:  c0000000000dbe40    R5:  0000000000000000   
 R6:  0000000000000001    R7:  0000000000000003    R8:  0000000000000002   
 R9:  0000000000000000    R10: 00000000000000ff    R11: c000201ffff9c000   
 R12: 0000000031d48000    R13: c000001fff7d7880    R14: c00000010e8fbf90   
 R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000   
 R18: 0000000000000000    R19: 0000000000000000    R20: 0000000000000000   
 R21: 0000000000000000    R22: 0000000000000000    R23: 0000000000000001   
 R24: 0000000028000000    R25: 0000000000100000    R26: 00000000000c0000   
 R27: 0000000000200000    R28: 00000000003c0000    R29: 00000000000000e8   
 R30: 000000000000001d    R31: 9000000000121033   
 NIP: c0000000000e3a08    MSR: 9000000000001033    OR3: 0000000000100000
 CTR: c000000000008000    LR:  c0000000000e3a08    XER: 0000000020040000
 CCR: 0000000048004422    MQ:  000000000000001d    DAR: c00000010e8fbc60
 DSISR: 0000000000000000     Syscall Result: c0000000000dcaac
 [NIP  : pnv_smp_cpu_kill_self+0x308]
 [LR   : pnv_smp_cpu_kill_self+0x308]
 #0 [c00000131b5e37e0] crash_kexec at c0000000002a2680
 #1 [c00000131b5e3820] oops_end at c000000000021ad8
 #2 [c00000131b5e38a0] bad_page_fault at c00000000008663c
 #3 [c00000131b5e3910] handle_page_fault at c00000000000a760
 Data Access [300] exception frame:
 R0:  c0000000005d064c    R1:  c00000131b5e3c10    R2:  c00000000220f300   
 R3:  c000000f92aab500    R4:  c000000180936f00    R5:  0000000000000001   
 R6:  c00000131b5e3be8    R7:  0000000000042088    R8:  0000000000002000   
 R9:  0000000000000000    R10: 0000000000000120    R11: 0000201ffac50000   
 R12: 0000000024242202    R13: c000201fffe26800    R14: 0000000055823118   
 R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000   
 R18: 0000000000000010    R19: 00000000bb9fede5    R20: 0000000000000002   
 R21: 00007fffb7665070    R22: 0000000000000003    R23: 0000000000000001   
 R24: 00007fffb76608a8    R25: 0000000000000000    R26: c000000180936f00   
 R27: 0000000000000005    R28: 0000000000000025    R29: 0000000000000000   
 R30: c000000180936f00    R31: c000000f92aab500   
 NIP: c0000000005895a8    MSR: 9000000000009033    OR3: c0000000005d0648
 CTR: 0000000000000000    LR:  c0000000005d064c    XER: 0000000020040000
 CCR: 0000000024242204    MQ:  0000000000000000    DAR: 0000000000000078
 DSISR: 0000000000080000     Syscall Result: 0000000000000000
 [NIP  : filp_close+0x48]
 [LR   : put_files_struct+0xdc]
 #4 [c00000131b5e3c10] filp_close at c0000000005895a8
 #5 [c00000131b5e3c90] put_files_struct at c0000000005d064c  (unreliable)
 #6 [c00000131b5e3cf0] do_exit at c00000000017a8a8
 #7 [c00000131b5e3dc0] do_group_exit at c00000000017b240
 #8 [c00000131b5e3e00] sys_exit_group at c00000000017b314
 #9 [c00000131b5e3e20] system_call at c00000000000b408
 System Call [c00] exception frame:
 R0:  00000000000000ea    R1:  00007fffd0407830    R2:  00007fffb7667300   
 R3:  0000000000000000    R4:  0000000000000000    R5:  0000000000000000   
 R6:  0000000000000001    R7:  0000000000000000    R8:  0000000000000000   
 R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000   
 R12: 0000000000000000    R13: 00007fffb42f0c90    R14: 0000000055823118   
 R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000   
 R18: 0000000000000010    R19: 00000000bb9fede5    R20: 0000000000000002   
 R21: 00007fffb7665070    R22: 0000000000000003    R23: 0000000000000001   
 R24: 00007fffb76608a8    R25: 0000000000000000    R26: 0000000000000000   
 R27: 0000000000000001    R28: 0000000000000000    R29: 0000000000000000   
 R30: 00007fffb42e9cd8    R31: 0000000000000000   
 NIP: 00007fffb755387c    MSR: 900000000280f033    OR3: 0000000000000000
 CTR: 0000000000000000    LR:  00007fffb74acff4    XER: 0000000000000000
 CCR: 0000000044242402    MQ:  0000000000000000    DAR: 00007fff9e1b0118
 DSISR: 0000000002040000     Syscall Result: 0000000000000000

Environment

  • Red Hat Enterprise Linux 8
  • PowerPC LE CPU architecture
  • Node that is part of an IBM Spectrum Scale storage cluster using the GPFS file system
