Kernel panic with RIP at __audit_syscall_exit+0x3c/0x280

Solution Unverified - Updated -

Environment

  • Red Hat Enterprise Linux 7
  • Kernel 3.10.0-862.14.4.el7.x86_64

Issue

  • The server reboots after a kernel panic in the function __audit_syscall_exit+0x3c/0x280.

Resolution

  • This issue is still under investigation. If similar call traces are observed, please create a new support case and provide the vmcore file for detailed investigation.

Diagnostic Steps

  • Analysis of vmcore

    crash>  sys | grep  -e RELEASE -e PANIC
    RELEASE: 3.10.0-862.14.4.el7.x86_64
    PANIC: "BUG: unable to handle kernel paging request at ffff91467fbe2458"
    
    crash> sys -i |head -n 5
    DMI_BIOS_VENDOR: HPE
    DMI_BIOS_VERSION: U30
    DMI_BIOS_DATE: 02/15/2018
    DMI_SYS_VENDOR: HPE
    DMI_PRODUCT_NAME: ProLiant DL380 Gen10
    
  • Backtraces

    crash> bt
    PID: 222986  TASK: ffff914229fc6eb0  CPU: 5   COMMAND: "jsvc"
    #0 [ffff9142616b7bb8] machine_kexec at ffffffff96062a0a
    #1 [ffff9142616b7c18] __crash_kexec at ffffffff961166c2
    #2 [ffff9142616b7ce8] crash_kexec at ffffffff961167b0
    #3 [ffff9142616b7d00] oops_end at ffffffff9671d728
    #4 [ffff9142616b7d28] no_context at ffffffff9670c84d
    #5 [ffff9142616b7d78] __bad_area_nosemaphore at ffffffff9670c8e4
    #6 [ffff9142616b7dc8] bad_area_nosemaphore at ffffffff9670ca55
    #7 [ffff9142616b7dd8] __do_page_fault at ffffffff967206e0
    #8 [ffff9142616b7e40] do_page_fault at ffffffff967208d5
    #9 [ffff9142616b7e70] page_fault at ffffffff9671c758
    [exception RIP: __audit_syscall_exit+60]
    RIP: ffffffff9613250c  RSP: ffff9142616b7f20  RFLAGS: 00010217
    RAX: 0000000000000001  RBX: ffff91467fbe2400  RCX: 0000000000000000
    RDX: 0000000000000080  RSI: 0000000000000000  RDI: 0000000000000001
    RBP: ffff9142616b7f48   R8: ffff9142616b7eb8   R9: 0000000000000001
    R10: 7fffffffffffffff  R11: 0000000000000000  R12: 0000000000000000
    R13: ffff914229fc6eb0  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff9142616b7f50] sysret_audit at ffffffff9672596e
    RIP: 00007fbc3240a113  RSP: 00007fbc03f535a0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 00007fbc03f53610  RCX: 0000000000000001
    RDX: 0000000000002000  RSI: 0000000004731000  RDI: 0000000000000201
    RBP: 00007fbc03f53540   R8: 0000000000000000   R9: 0000000000000001
    R10: 00000000000001f4  R11: 0000000000000293  R12: 00000000000001f4
    R13: 0000000000000201  R14: 0000000000002000  R15: 0000000004731000
    ORIG_RAX: 00000000000000e8  CS: 0033  SS: 002b
    
  • Kernel ring buffer

    [219944.770553] BUG: unable to handle kernel paging request at ffff91467fbe2458
    [219944.770585] IP: [<ffffffff9613250c>] __audit_syscall_exit+0x3c/0x280
    [219944.770608] PGD 58d7e31067 PUD 2ef4631063 PMD ffff917193a9f080 
    [219944.770627] Oops: 0002 [#1] SMP 
    [219944.770638] Modules linked in: macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill 8021q garp mrp stp llc bonding ktap_89422(OE) ext4 mbcache jbd2 skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif hpwdt hpilo pcspkr ses enclosure sg mei_me mei lpc_ich shpchp wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul i40e crct10dif_common crc32c_intel serio_raw smartpqi tg3 scsi_transport_sas
    [219944.770888]  i2c_core ptp pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ampnetworkflow]
    [219944.770918] CPU: 5 PID: 222986 Comm: jsvc Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.14.4.el7.x86_64 #1
    [219944.770947] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 02/15/2018
    [219944.770969] task: ffff914229fc6eb0 ti: ffff9142616b4000 task.ti: ffff9142616b4000
    [219944.770988] RIP: 0010:[<ffffffff9613250c>]  [<ffffffff9613250c>] __audit_syscall_exit+0x3c/0x280
    [219944.771013] RSP: 0018:ffff9142616b7f20  EFLAGS: 00010217
    [219944.771027] RAX: 0000000000000001 RBX: ffff91467fbe2400 RCX: 0000000000000000
    [219944.771045] RDX: 0000000000000080 RSI: 0000000000000000 RDI: 0000000000000001
    [219944.771064] RBP: ffff9142616b7f48 R08: ffff9142616b7eb8 R09: 0000000000000001
    [219944.771083] R10: 7fffffffffffffff R11: 0000000000000000 R12: 0000000000000000
    [219944.771101] R13: ffff914229fc6eb0 R14: 0000000000000000 R15: 0000000000000000
    [219944.771120] FS:  00007fbc03f54700(0000) GS:ffff9146bff40000(0000) knlGS:0000000000000000
    [219944.771140] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [219944.771155] CR2: ffff91467fbe2458 CR3: 000000585d708000 CR4: 00000000007607e0
    [219944.771173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [219944.771191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [219944.771209] PKRU: 55555554
    [219944.771217] Call Trace:
    [219944.771229]  [<ffffffff9672596e>] sysret_audit+0x17/0x21
    [219944.771244] Code: 8b 2c 25 80 0e 01 00 41 54 83 ff 01 19 c0 53 49 8b 9d d8 07 00 00 f7 d0 83 c0 02 48 85 db 0f 84 c3 01 00 00 48 81 fe 01 fe ff ff <89> 43 58 0f 8c 05 02 00 00 48 89 73 48 8b 4b 04 85 c9 74 0a 8b 
    [219944.771348] RIP  [<ffffffff9613250c>] __audit_syscall_exit+0x3c/0x280
    [219944.772157]  RSP <ffff9142616b7f20>
    [219944.772935] CR2: ffff91467fbe2458
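
The backtrace reports the faulting instruction as `__audit_syscall_exit+60`, while the ring buffer reports `__audit_syscall_exit+0x3c/0x280`. These are the same location in decimal and hexadecimal. The following minimal sketch (the helper name `parse_symbol` is hypothetical, not part of any tool) splits the symbol reference and derives the function start address from the RIP value shown above:

```python
import re

def parse_symbol(ref):
    """Split 'name+0xOFF/0xSIZE' into (name, offset, size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", ref)
    if m is None:
        raise ValueError(f"unrecognised symbol reference: {ref!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

# Values copied from the oops output above.
rip = 0xffffffff9613250c
name, offset, size = parse_symbol("__audit_syscall_exit+0x3c/0x280")

# The function start is the faulting address minus the offset into the function.
start = rip - offset
print(f"{name} starts at {start:#x}; fault at byte {offset} of {size}")
# prints: __audit_syscall_exit starts at 0xffffffff961324d0; fault at byte 60 of 640
```

The computed start address should agree with the first address that `dis` prints for the function.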
    
    
  • Disassembly up to the exception RIP, __audit_syscall_exit+0x3c/0x280 (ffffffff9613250c)

    crash> dis -rl ffffffff9613250c
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 1561
    0xffffffff961324d0 <__audit_syscall_exit>:      nopl   0x0(%rax,%rax,1) [FTRACE NOP]
    0xffffffff961324d5 <__audit_syscall_exit+5>:    push   %rbp
    0xffffffff961324d6 <__audit_syscall_exit+6>:    mov    %rsp,%rbp
    0xffffffff961324d9 <__audit_syscall_exit+9>:    push   %r15
    0xffffffff961324db <__audit_syscall_exit+11>:   push   %r14
    0xffffffff961324dd <__audit_syscall_exit+13>:   push   %r13
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/arch/x86/include/asm/current.h: 14
    0xffffffff961324df <__audit_syscall_exit+15>:   mov    %gs:0x10e80,%r13
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 1561
    0xffffffff961324e8 <__audit_syscall_exit+24>:   push   %r12
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 1566
    0xffffffff961324ea <__audit_syscall_exit+26>:   cmp    $0x1,%edi
    0xffffffff961324ed <__audit_syscall_exit+29>:   sbb    %eax,%eax
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 1561
    0xffffffff961324ef <__audit_syscall_exit+31>:   push   %rbx
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 841
    0xffffffff961324f0 <__audit_syscall_exit+32>:   mov    0x7d8(%r13),%rbx
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 1566
    0xffffffff961324f7 <__audit_syscall_exit+39>:   not    %eax
    0xffffffff961324f9 <__audit_syscall_exit+41>:   add    $0x2,%eax
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 843
    0xffffffff961324fc <__audit_syscall_exit+44>:   test   %rbx,%rbx
    0xffffffff961324ff <__audit_syscall_exit+47>:   je     0xffffffff961326c8 <__audit_syscall_exit+504>
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 858
    0xffffffff96132505 <__audit_syscall_exit+53>:   cmp    $0xfffffffffffffe01,%rsi
    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 845
    0xffffffff9613250c <__audit_syscall_exit+60>:   mov    %eax,0x58(%rbx)
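
The `cmp`/`sbb`/`not`/`add` sequence above computes the value that the faulting `mov` tries to store from `%eax`. As a rough illustration, the sketch below emulates those four instructions on 32-bit values in Python. The function name `return_valid_from_edi` is hypothetical, and the emulation models only the carry flag, which is all this sequence depends on.

```python
MASK32 = 0xffffffff

def return_valid_from_edi(edi):
    """Emulate: cmp $0x1,%edi; sbb %eax,%eax; not %eax; add $0x2,%eax."""
    # cmp $0x1,%edi: the carry flag is set when edi < 1 (unsigned), i.e. edi == 0.
    cf = 1 if (edi & MASK32) < 1 else 0
    # sbb %eax,%eax: eax = eax - eax - CF, so 0 or 0xffffffff.
    eax = (-cf) & MASK32
    # not %eax: invert all 32 bits.
    eax = ~eax & MASK32
    # add $0x2,%eax: wraps around modulo 2**32.
    eax = (eax + 2) & MASK32
    return eax

# RDI was 1 at the time of the crash (see the register dump).
print(return_valid_from_edi(1))  # prints 1
print(return_valid_from_edi(0))  # prints 2
```

With `RDI = 1` the sequence yields 1, which matches `RAX: 0000000000000001` in the register dump, so the value being written looks sane and only the store itself failed.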
    
  • The system crashed while writing to audit_context->return_valid (offset 0x58 from %rbx)

    /usr/src/debug/kernel-3.10.0-862.14.4.el7/linux-3.10.0-862.14.4.el7.x86_64/kernel/auditsc.c: 845
    
     845    context->return_valid = return_valid;
    0xffffffff9613250c <__audit_syscall_exit+0x3c>:  mov    %eax,0x58(%rbx)
    
    
  • %rbx held the value below, which matches current->audit_context of the crashing task.

    RBX: ffff91467fbe2400
    crash> task_struct.audit_context ffff914229fc6eb0
    audit_context = 0xffff91467fbe2400
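
The relationship between `%rbx`, the store displacement and the faulting address can be checked with simple arithmetic. This is a minimal sketch using only values copied from the output above; the variable names are illustrative, and the 4 KiB page size is an assumption about the base page size on x86_64.

```python
rbx = 0xffff91467fbe2400   # RBX from the register dump (current->audit_context)
cr2 = 0xffff91467fbe2458   # faulting address reported in CR2
displacement = 0x58        # from: mov %eax,0x58(%rbx)

# The store target is the base pointer plus the displacement.
target = rbx + displacement
print(f"store target: {target:#x}")      # prints store target: 0xffff91467fbe2458
print(target == cr2)                      # prints True

# Both addresses fall in the same 4 KiB page (assumed base page size),
# so the structure base and the faulting field share one mapping.
print((rbx >> 12) == (cr2 >> 12))         # prints True
```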
    
  • The return_valid field at that address is readable in the vmcore and holds a plausible value.

    crash> audit_context.return_valid 0xffff91467fbe2400
    return_valid = 0x1
    
  • The system crashed with a page fault even though the address appears valid. The Oops error code 0002 indicates a kernel-mode write to a page that was not present.

    [219944.770553] BUG: unable to handle kernel paging request at ffff91467fbe2458
    [219944.770585] IP: [<ffffffff9613250c>] __audit_syscall_exit+0x3c/0x280
    [219944.770608] PGD 58d7e31067 PUD 2ef4631063 PMD ffff917193a9f080 
    [219944.770627] Oops: 0002 [#1] SMP 
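
To make the oops lines above easier to read, the following sketch decodes the x86 page-fault error code and the low flag bits of the printed page-table entries. The bit positions are the architectural ones; the helper names `decode_error_code` and `entry_flags` are hypothetical. It only describes the raw bits and does not explain why the PMD entry was in that state.

```python
def decode_error_code(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x01),      # 0 = page not present
        "write_access": bool(code & 0x02),      # 1 = write, 0 = read
        "user_mode": bool(code & 0x04),         # 1 = fault raised in user mode
        "reserved_bit": bool(code & 0x08),      # 1 = reserved bit set in an entry
        "instruction_fetch": bool(code & 0x10), # 1 = instruction fetch
    }

def entry_flags(entry):
    """Decode selected low flag bits of an x86_64 page-table entry."""
    return {
        "present": bool(entry & 0x01),
        "writable": bool(entry & 0x02),
        "user": bool(entry & 0x04),
        "accessed": bool(entry & 0x20),
        "dirty": bool(entry & 0x40),
        "bit7_page_size": bool(entry & 0x80),
    }

# Values copied from the oops output above.
print(decode_error_code(0x0002))
for level, value in [("PGD", 0x58d7e31067),
                     ("PUD", 0x2ef4631063),
                     ("PMD", 0xffff917193a9f080)]:
    print(level, entry_flags(value))
```

For this oops, the error code decodes to a kernel-mode write to a non-present page, and the PGD and PUD entries have the present bit set while the printed PMD value does not.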
    

