Kernel panic in the sf_iget_test() function of the proprietary module secfs2

Solution Unverified

Environment

  • Red Hat Enterprise Linux
  • Third-party module secfs2

Issue

  • The kernel panics with the following message in the kernel ring buffer:
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 PGD 0 P4D 0 
 Oops: 0000 [#1] SMP NOPTI
 CPU: 1 PID: 345297 Comm: find Kdump: loaded Tainted: P           OE     -------- -  - 4.18.0-553.36.1.el8_10.x86_64 #1
 RIP: 0010:sf_iget_test+0x4c/0x80 [secfs2]
..
 Call Trace:
  find_inode.isra.26+0x5e/0xd0
  ilookup5_nowait+0x6a/0xa0
  ilookup5.part.30+0x2c/0x90
  iget5_locked+0x26/0x90
  sf_protect+0x108/0x910 [secfs2]
  op_lookup+0x27b/0x3b0 [secfs2]
  __lookup_slow+0x97/0x160
  lookup_slow+0x35/0x50
  walk_component+0x1c3/0x300
  link_path_walk+0x2c1/0x550
  path_lookupat.isra.43+0x9b/0x220
  filename_lookup.part.58+0xa0/0x170
  vfs_statx+0x74/0xe0
  __do_sys_newstat+0x39/0x70
  do_syscall_64+0x5b/0x1a0
  entry_SYSCALL_64_after_hwframe+0x66/0xcb
 RIP: 0033:0x7fdbfec9bb09
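
The kernel-mode "RIP:" line and the "Tainted:" letters in the message above already identify the faulting module. As a minimal sketch (the helper names parse_rip and decode_taint are hypothetical and not part of any Red Hat tool, and the taint table covers only the three letters seen in this oops), they could be extracted like this:

```python
import re

# Kernel-mode "RIP:" line of an x86_64 oops, for example:
#   RIP: 0010:sf_iget_test+0x4c/0x80 [secfs2]
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-f]{4}:(?P<func>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Only the taint letters that appear in this oops; the kernel defines more.
TAINT_MEANINGS = {
    "P": "proprietary module was loaded",
    "O": "out-of-tree module was loaded",
    "E": "unsigned module was loaded",
}


def parse_rip(line):
    """Return function, offset, size and module from a kernel-mode RIP line."""
    m = RIP_RE.search(line)
    if m is None:
        return None  # e.g. the user-space "RIP: 0033:0x..." line
    return {
        "function": m.group("func"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None means no module tag on the line
    }


def decode_taint(flags):
    """Translate the known taint letters, skipping spaces and dashes."""
    return [TAINT_MEANINGS[c] for c in flags if c in TAINT_MEANINGS]


info = parse_rip(" RIP: 0010:sf_iget_test+0x4c/0x80 [secfs2]")
print(info)
print(decode_taint("P           OE"))
```

A module name in square brackets after the symbol means the faulting instruction lives in a loadable module rather than in the core kernel image.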

Resolution

  • secfs2 is a proprietary kernel module that is not shipped by Red Hat.
  • Contact the provider of the third-party kernel module secfs2 for further investigation.

Possible Workaround:

  • Blacklist the third-party kernel module [secfs2].
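
A blacklist is normally expressed as a small configuration file under /etc/modprobe.d/. The sketch below only renders the text of such a file. The function name render_blacklist is hypothetical, and the "install ... /bin/false" line is an assumption about how strictly loading should be blocked; "blacklist" on its own only stops automatic, alias-based loading.

```python
def render_blacklist(module):
    """Return modprobe.d directives that keep `module` from being loaded."""
    return (
        f"# Stop automatic (alias-based) loading of {module}\n"
        f"blacklist {module}\n"
        f"# Make explicit 'modprobe {module}' requests fail as well\n"
        f"install {module} /bin/false\n"
    )


# Print the fragment; saving it as a .conf file under /etc/modprobe.d/
# (the file name is the administrator's choice) is left to the caller.
conf_text = render_blacklist("secfs2")
print(conf_text, end="")
```

A module that is already loaded stays loaded until it is removed or the system is rebooted, so the file only affects later load attempts.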

Root Cause

  • The kernel panics inside the sf_iget_test() function of the secfs2 module while dereferencing a NULL pointer.

Diagnostic Steps

  • Stack trace of the task that triggered the panic:
crash> bt
PID: 345297   TASK: ffff8c96e61e2800  CPU: 1    COMMAND: "find"
 #0 [ffffa801d096b600] machine_kexec at ffffffff8406f3a3
 #1 [ffffa801d096b658] __crash_kexec at ffffffff841bacba
 #2 [ffffa801d096b718] crash_kexec at ffffffff841bbbf1
 #3 [ffffa801d096b730] oops_end at ffffffff8402d771
 #4 [ffffa801d096b750] no_context at ffffffff84081da3
 #5 [ffffa801d096b7a8] __bad_area_nosemaphore at ffffffff84082107
 #6 [ffffa801d096b7f0] do_page_fault at ffffffff84082dc7
 #7 [ffffa801d096b820] page_fault at ffffffff84c011fe
    [exception RIP: sf_iget_test+76]
    RIP: ffffffffc1b23cec  RSP: ffffa801d096b8d0  RFLAGS: 00010217
    RAX: 0000000000000001  RBX: ffff8c96fc30c000  RCX: ffff8c96dc70e428
    RDX: 0000000000000000  RSI: ffffa801d096b9c0  RDI: ffff8c96fc30c000
    RBP: ffff8c96d45c3800   R8: ffffa801d096b9c0   R9: 0000000000000000
    R10: ffffffff85322bf9  R11: 0000000000000007  R12: ffffa801d096b9c0
    R13: ffffffffc1b23ca0  R14: ffffa801c02d0fc8  R15: ffffffffc1b20d70
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffa801d096b8d0] find_inode at ffffffff8438e8ce
 #9 [ffffa801d096b908] ilookup5_nowait at ffffffff8438e9aa
#10 [ffffa801d096b930] ilookup5 at ffffffff8438ea0c
#11 [ffffa801d096b968] iget5_locked at ffffffff8438ec76
#12 [ffffa801d096b9a0] sf_protect at ffffffffc1b253c8 [secfs2]
#13 [ffffa801d096ba88] op_lookup at ffffffffc1b2621b [secfs2]
#14 [ffffa801d096bb70] __lookup_slow at ffffffff8437c127
#15 [ffffa801d096bbd0] lookup_slow at ffffffff8437c225
#16 [ffffa801d096bbf8] walk_component at ffffffff8437c403
#17 [ffffa801d096bc58] link_path_walk at ffffffff8437cf11
#18 [ffffa801d096bcb8] path_lookupat at ffffffff8437d2bb
#19 [ffffa801d096bd18] filename_lookup at ffffffff843819c0
#20 [ffffa801d096be40] vfs_statx at ffffffff843745c4
#21 [ffffa801d096be98] __do_sys_newstat at ffffffff84374c19
#22 [ffffa801d096bf38] do_syscall_64 at ffffffff8400549b
#23 [ffffa801d096bf50] entry_SYSCALL_64_after_hwframe at ffffffff84c0012e
    RIP: 00007fdbfec9bb09  RSP: 00007ffed185ebf8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000561a8c4f0f60  RCX: 00007fdbfec9bb09
    RDX: 0000561a8c4f0fd8  RSI: 0000561a8c4f0fd8  RDI: 0000561a8c19a9c0
    RBP: 0000561a8c4f0fd8   R8: 0000561a8c4f0e30   R9: 0000561a8c16d032
    R10: 0000000000000072  R11: 0000000000000246  R12: 0000561a8c19a930
    R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: 0000000000000004  CS: 0033  SS: 002b
  • Disassembling the instruction at the exception RIP shows a "mov" that loads the value stored at the address held in %rdx into %rsi. %rdx is 0000000000000000, so the load faults on a NULL address:
crash> dis sf_iget_test+76
0xffffffffc1b23cec <sf_iget_test+76>:   mov    (%rdx),%rsi
  • The instruction's virtual address resolves to a symbol in the third-party secfs2 kernel module:
crash> sym 0xffffffffc1b23cec
ffffffffc1b23cec (t) sf_iget_test+76 [secfs2] 
  • Third-party modules loaded on the system:
crash> mod -t
NAME       TAINTS
secvm2     POE
seccrypto  POE
secfs2     POE   <<----
  • In another case with the same pattern, the system crashed in the function sf_iget_test(), which is implemented in the third-party module secfs2:
crash> bt
PID: 2472241  TASK: ffff8ce088e50000  CPU: 5    COMMAND: "java"
 #0 [ffffb70980857440] machine_kexec at ffffffff8387a897
 #1 [ffffb70980857498] __crash_kexec at ffffffff839faeba
 #2 [ffffb70980857558] crash_kexec at ffffffff839fbfe8
 #3 [ffffb70980857560] oops_end at ffffffff83831dea
 #4 [ffffb70980857580] page_fault_oops at ffffffff8388c25b
 #5 [ffffb709808575d8] exc_page_fault at ffffffff844d2d62
 #6 [ffffb70980857600] asm_exc_page_fault at ffffffff84600bb2
    [exception RIP: sf_iget_test+0x4c]
    RIP: ffffffffc10b088c  RSP: ffffb709808576b0  RFLAGS: 00010213
    RAX: 0000000000000001  RBX: ffff8ce0a82ae000  RCX: ffff8ce0ada89358
    RDX: 0000000000000000  RSI: ffffb70980857768  RDI: ffff8ce11c2bca60
    RBP: ffff8ce11c2bca60   R8: ffffb70980857768   R9: 0000000000000000
    R10: ffff8ce089e3fb00  R11: ffff8ce0e1812d80  R12: ffffb70980857768
    R13: ffffffffc10b0840  R14: ffff8ce3aeac7880  R15: ffff8ce3aeac7880
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb709808576b0] find_inode at ffffffff83c6e09e
 #8 [ffffb709808576e8] inode_insert5 at ffffffff83c7013c
 #9 [ffffb70980857728] iget5_locked at ffffffff83c7038f
#10 [ffffb70980857760] sf_inode_get.constprop.0 at ffffffffc10afd32 [secfs2]
#11 [ffffb709808577a8] sf_protect at ffffffffc10b4c65 [secfs2]
#12 [ffffb70980857878] op_lookup at ffffffffc10b5c7f [secfs2]
#13 [ffffb70980857960] __lookup_slow at ffffffff83c58aa4
#14 [ffffb709808579b8] walk_component at ffffffff83c5dc38
#15 [ffffb70980857a10] link_path_walk at ffffffff83c5df0e
#16 [ffffb70980857a70] path_lookupat at ffffffff83c5e4be
#17 [ffffb70980857aa8] filename_lookup at ffffffff83c5f93f
#18 [ffffb70980857bc8] vfs_statx at ffffffff83c5111d
#19 [ffffb70980857c20] vfs_fstatat at ffffffff83c51534
#20 [ffffb70980857c48] __do_sys_newstat at ffffffff83c51660
#21 [ffffb70980857d00] do_syscall_64 at ffffffff844ce45f
#22 [ffffb70980857f50] entry_SYSCALL_64_after_hwframe at ffffffff84600130
    RIP: 00007f9561b0e97a  RSP: 00007f9531efd678  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000c6741f8  RCX: 00007f9561b0e97a
    RDX: 00007f9531efd680  RSI: 00007f9531efd680  RDI: 000000000bb66600
    RBP: 00007f9531efd730   R8: 0000000000000001   R9: 0000000000000006
    R10: 00007f9550e52a40  R11: 0000000000000246  R12: 0000000004834440
    R13: 0000000000000000  R14: 000000000bb66600  R15: 000000000c674000
    ORIG_RAX: 0000000000000004  CS: 0033  SS: 002b
crash> sym sf_iget_test
ffffffffc10b0840 (t) sf_iget_test [secfs2] 
  • The crash occurred while sf_iget_test() was trying to access the address pointed to by inode.i_private, which holds the value 0 (NULL):
/usr/src/debug/kernel-5.14.0-503.23.2.el9_5/linux-5.14.0-503.23.2.el9_5.x86_64/fs/inode.c: 901
     901        if (!test(inode, data))
     902            continue;
0xffffffff83c6e092 <find_inode+0x52>:   mov    %r12,%rsi
0xffffffff83c6e095 <find_inode+0x55>:   mov    %rbp,%rdi
0xffffffff83c6e098 <find_inode+0x58>:   cs call 0xffffffff844e6480 <__x86_indirect_thunk_r13>



0xffffffffc10b087c <sf_iget_test+0x3c>: mov    0x288(%rdi),%rdx  <-- inode.i_private
0xffffffffc10b0883 <sf_iget_test+0x43>: mov    0x10(%rsi),%rcx
0xffffffffc10b0887 <sf_iget_test+0x47>: cmp    %rcx,%rdx
0xffffffffc10b088a <sf_iget_test+0x4a>: je     0xffffffffc10b085b <sf_iget_test+0x1b>

crash> struct inode.i_private 
struct inode {
  [0x288] void *i_private;
}

crash> struct inode.i_op,i_sb,i_private ffff8ce11c2bca60
  i_op = 0xffffffffc1234f40 <vmfs_atomic_open_iops>,
  i_sb = 0xffff8ce0a82ae000,
  i_private = 0x0   <--- NULL
crash> sym vmfs_atomic_open_iops
ffffffffc1234f40 (?) vmfs_atomic_open_iops [secfs2]
  • As shown above, this inode is managed by secfs2 (its i_op points to a symbol in the module), and the function that crashed also belongs to secfs2.
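
The address arithmetic behind the second analysis can be cross-checked by hand. The sketch below uses only values copied from the crash output above. The variable names are illustrative, and treating RDI as the struct inode pointer is an inference from the "mov %rbp,%rdi" before the indirect call in find_inode().

```python
# Values copied from the second vmcore in this article.
EXCEPTION_RIP = 0xFFFFFFFFC10B088C  # RIP in the exception frame
SYMBOL_BASE = 0xFFFFFFFFC10B0840    # crash> sym sf_iget_test
R13 = 0xFFFFFFFFC10B0840            # target of the indirect call in find_inode()
INODE_ADDR = 0xFFFF8CE11C2BCA60     # RDI/RBP: struct inode passed to the callback
I_PRIVATE_OFF = 0x288               # crash> struct inode.i_private
RDX = 0x0                           # loaded from inode.i_private before the fault

# Offset of the faulting instruction inside sf_iget_test().
fault_offset = EXCEPTION_RIP - SYMBOL_BASE

# Address of the i_private field inside this particular inode.
i_private_addr = INODE_ADDR + I_PRIVATE_OFF

print(f"faulting instruction : sf_iget_test+{fault_offset:#x}")
print(f"r13 is sf_iget_test  : {R13 == SYMBOL_BASE}")
print(f"&inode->i_private    : {i_private_addr:#x}")
print(f"i_private is NULL    : {RDX == 0}")
```

The offset agrees with the "sf_iget_test+0x4c" reported in the backtrace, and R13 matching the symbol base is consistent with find_inode() reaching the callback through __x86_indirect_thunk_r13.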

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
