Kernel crash with "general protection fault, probably for non-canonical address" in svc_generic_init_request()
Environment
- Red Hat Enterprise Linux 8
Issue
- The crash happened in svc_generic_init_request() when trying to dereference address 0x8b08478908474e47.
Resolution
- At present, no definitive solution has been identified.
- Enable SLUB debugging (slub_debug=FZP) via the kernel command line, reproduce the issue, and provide the new vmcore for detailed analysis.
Workaround
- For guest systems: Contact the hypervisor vendor to check for emulation-related or other underlying host issues. If feasible, migrate the guest VM to a different ESXi host and monitor whether the issue recurs.
- For physical systems: This indicates a potential CPU hardware fault. Engage the respective hardware vendor to run comprehensive diagnostics and replace any components identified as faulty, to rule out hardware-related issues.
Root Cause
- The crash happened in svc_generic_init_request() when trying to dereference address 0x8b08478908474e47 while performing standard NFS routines. The system crashed because a CPU register held a garbage value. It is unclear how the value got into this register. In theory this should not be possible via software, which suggests a hardware or VMware emulation issue.
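As a rough illustration of why this address raises a general protection fault rather than a page fault, the sketch below checks whether a 64-bit address is canonical. It assumes 48-bit virtual addressing (4-level paging). The helper name is_canonical is hypothetical and not part of any kernel or crash tooling.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86_64 address.

    An address is canonical when bit (va_bits - 1) and all bits above it
    are either all zeros or all ones. va_bits=48 is an assumption here
    (4-level paging); pass 57 for 5-level paging.
    """
    addr &= (1 << 64) - 1
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


if __name__ == "__main__":
    # Address from the panic message, followed by a valid kernel pointer
    # taken from the register dump (RBP).
    for value in (0x8b08478908474e47, 0xffffffffc0379100):
        print(f"{value:#018x} canonical={is_canonical(value)}")
```

The address from the panic message fails this check for both 48-bit and 57-bit widths, while the other kernel pointers in the register dump pass it.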
Diagnostic Steps
- System Information:
crash> sys | grep -i "RELEASE\|PANIC"
RELEASE: 4.18.0-513.24.1.el8_9.x86_64
PANIC: "general protection fault, probably for non-canonical address 0x8b08478908474e47: 0000 [#1] SMP NOPTI"
- The kernel panicked with the following messages in the kernel ring buffer.
[ 2295.638340] general protection fault, probably for non-canonical address 0x8b08478908474e47: 0000 [#1] SMP NOPTI
[ 2295.638454] CPU: 0 PID: 2383 Comm: nfsd Kdump: loaded Not tainted 4.18.0-513.24.1.el8_9.x86_64 #1
[ 2295.638488] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 2295.638526] RIP: 0010:svc_generic_init_request+0x87/0x110 [sunrpc]
[ 2295.638609] Code: 00 48 29 c1 48 89 c8 48 8b 4d 08 4c 8d 24 c1 4c 89 a7 48 01 00 00 4d 85 e4 74 6a 48 89 fb 49 89 d5 48 8b bf 90 2b 00 00 31 f6 <41> 8b 54 24 20 e8 3f c1 77 c1 41 8b 54 24 24 48 8b bb 98 2b 00 00
[ 2295.638692] RSP: 0018:ffffc90004557e60 EFLAGS: 00010246
[ 2295.638718] RAX: 0000000000000007 RBX: ffff88810d070000 RCX: 8b08478908474e0f
[ 2295.638744] RDX: ffffc90004557e98 RSI: 0000000000000000 RDI: ffff88810d074000
[ 2295.638804] RBP: ffffffffc0379100 R08: 0000000000000008 R09: 0000000000000001
[ 2295.638838] R10: 000000004503b7d6 R11: 0000000012000000 R12: 8b08478908474e47
[ 2295.638883] R13: ffffc90004557e98 R14: ffff888110f69300 R15: 00000000000186a3
[ 2295.638927] FS: 0000000000000000(0000) GS:ffff888274e00000(0000) knlGS:0000000000000000
[ 2295.638962] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2295.638990] CR2: 00007fc2ce76a000 CR3: 0000000002a10004 CR4: 00000000007706f0
[ 2295.639045] PKRU: 55555554
[ 2295.639057] Call Trace:
[ 2295.639083] ? __die_body+0x1a/0x60
[ 2295.639104] ? die_addr+0x38/0x51
[ 2295.639120] ? do_general_protection+0x135/0x280
[ 2295.639174] ? general_protection+0x1e/0x30
[ 2295.639253] ? svc_generic_init_request+0x87/0x110 [sunrpc]
[ 2295.639349] svc_process_common+0x2e8/0x5c0 [sunrpc]
[ 2295.639440] ? svc_sock_secure_port+0x12/0x40 [sunrpc]
[ 2295.639529] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 2295.639618] svc_process+0xb7/0xf0 [sunrpc]
[ 2295.639703] nfsd+0xe3/0x140 [nfsd]
[ 2295.639781] kthread+0x134/0x150
[ 2295.639849] ? set_kthread_struct+0x50/0x50
[ 2295.639917] ret_from_fork+0x1f/0x40
[ 2295.639986] Modules linked in: mptcp_diag xsk_diag vsock_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag nfnetlink_queue nfnetlink_log nft_log nft_ct nf_tables_set xt_LOG nf_log_syslog nf_conntrack_tftp nf_conntrack_ftp cfg80211 rfkill xt_REDIRECT nft_counter nft_chain_nat xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 vsock_loopback nf_defrag_ipv4 vmw_vsock_virtio_transport_common nft_compat nf_tables vmw_vsock_vmci_transport libcrc32c vsock nfnetlink loop intel_rapl_msr intel_rapl_common intel_uncore_frequency_common isst_if_mbox_msr isst_if_common vmwgfx nfit libnvdimm drm_ttm_helper crct10dif_pclmul crc32_pclmul ttm ghash_clmulni_intel drm_kms_helper syscopyarea rapl sysfillrect sysimgblt vmw_balloon drm joydev pcspkr i2c_piix4 vmw_vmci nfsd auth_rpcgss nfs_acl lockd grace sunrpc ext4 mbcache jbd2 sd_mod t10_pi sg ata_generic ata_piix libata crc32c_intel vmxnet3 serio_raw vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod softdog fuse
[ 2295.640654] Red Hat flags: eBPF/LSM eBPF/event eBPF/rawtrace
- Backtrace of the panic task.
crash> set -p
PID: 2383
COMMAND: "nfsd"
TASK: ffff888131280000 [THREAD_INFO: ffff888131280000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 2383 TASK: ffff888131280000 CPU: 0 COMMAND: "nfsd"
#0 [ffffc90004557bd0] machine_kexec at ffffffff8106da63
#1 [ffffc90004557c28] __crash_kexec at ffffffff811b86ca
#2 [ffffc90004557ce8] crash_kexec at ffffffff811b9601
#3 [ffffc90004557d00] oops_end at ffffffff8102be31
#4 [ffffc90004557d20] do_general_protection at ffffffff81028915
#5 [ffffc90004557db0] general_protection at ffffffff81c0117e
[exception RIP: svc_generic_init_request+135]
RIP: ffffffffc0284e87 RSP: ffffc90004557e60 RFLAGS: 00010246
RAX: 0000000000000007 RBX: ffff88810d070000 RCX: 8b08478908474e0f
RDX: ffffc90004557e98 RSI: 0000000000000000 RDI: ffff88810d074000
RBP: ffffffffc0379100 R8: 0000000000000008 R9: 0000000000000001
R10: 000000004503b7d6 R11: 0000000012000000 R12: 8b08478908474e47
R13: ffffc90004557e98 R14: ffff888110f69300 R15: 00000000000186a3
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffffc90004557e80] svc_process_common at ffffffffc02852a8 [sunrpc]
#7 [ffffc90004557ed8] svc_process at ffffffffc0285637 [sunrpc]
#8 [ffffc90004557ef0] nfsd at ffffffffc0341663 [nfsd]
#9 [ffffc90004557f10] kthread at ffffffff8111fb44
#10 [ffffc90004557f50] ret_from_fork at ffffffff81c0028f
- Disassembly of the function where the kernel panic occurred.
crash> dis -rl ffffffffc0284e87 | tail
0xffffffffc0284e64 <svc_generic_init_request+100>: mov 0x8(%rbp),%rcx
0xffffffffc0284e68 <svc_generic_init_request+104>: lea (%rcx,%rax,8),%r12
0xffffffffc0284e6c <svc_generic_init_request+108>: mov %r12,0x148(%rdi)
0xffffffffc0284e73 <svc_generic_init_request+115>: test %r12,%r12
0xffffffffc0284e76 <svc_generic_init_request+118>: je 0xffffffffc0284ee2 <svc_generic_init_request+226>
0xffffffffc0284e78 <svc_generic_init_request+120>: mov %rdi,%rbx
0xffffffffc0284e7b <svc_generic_init_request+123>: mov %rdx,%r13
0xffffffffc0284e7e <svc_generic_init_request+126>: mov 0x2b90(%rdi),%rdi
0xffffffffc0284e85 <svc_generic_init_request+133>: xor %esi,%esi
0xffffffffc0284e87 <svc_generic_init_request+135>: mov 0x20(%r12),%edx
- Walking through the same instructions with the values involved. RBP points to nfsd_version4, and the pointer stored at offset 0x8 is valid in memory, yet RCX holds a garbage value after the load.
0xffffffffc0284e64 <svc_generic_init_request+0x64>: mov 0x8(%rbp),%rcx
crash> sym ffffffffc0379100
ffffffffc0379100 (B) nfsd_version4 [nfsd]
crash> rd ffffffffc0379100 -o 0x8
ffffffffc0379108: ffffffffc0379140 @.7.....
0xffffffffc0284e68 <svc_generic_init_request+0x68>: lea (%rcx,%rax,8),%r12
RCX: 8b08478908474e0f
0xffffffffc0284e6c <svc_generic_init_request+0x6c>: mov %r12,0x148(%rdi)
0xffffffffc0284e73 <svc_generic_init_request+0x73>: test %r12,%r12
0xffffffffc0284e76 <svc_generic_init_request+0x76>: je 0xffffffffc0284ee2 <svc_generic_init_request+0xe2>
0xffffffffc0284e78 <svc_generic_init_request+0x78>: mov %rdi,%rbx
0xffffffffc0284e7b <svc_generic_init_request+0x7b>: mov %rdx,%r13
0xffffffffc0284e7e <svc_generic_init_request+0x7e>: mov 0x2b90(%rdi),%rdi
0xffffffffc0284e85 <svc_generic_init_request+0x85>: xor %esi,%esi
0xffffffffc0284e87 <svc_generic_init_request+0x87>: mov 0x20(%r12),%edx
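The address arithmetic of the lea instruction can be reproduced to compare what R12 held with what it would have held had the load returned the value seen in memory. This is a minimal sketch: the helper lea_base_index_scale is hypothetical, and it assumes the contents of nfsd_version4+0x8 at the time of the load were the same as those read from the vmcore.

```python
MASK64 = (1 << 64) - 1


def lea_base_index_scale(base: int, index: int, scale: int) -> int:
    """Model `lea (base,index,scale),dst` with 64-bit wraparound."""
    return (base + index * scale) & MASK64


rax = 0x7                             # RAX from the register dump
rcx_observed = 0x8b08478908474e0f     # RCX from the register dump
rcx_expected = 0xffffffffc0379140     # value read at nfsd_version4+0x8

r12_observed = lea_base_index_scale(rcx_observed, rax, 8)
r12_expected = lea_base_index_scale(rcx_expected, rax, 8)

if __name__ == "__main__":
    print(f"observed R12: {r12_observed:#018x}")
    print(f"expected R12: {r12_expected:#018x}")
```

The observed result matches R12 in the register dump, so the lea itself executed correctly on the bad RCX value. The fault stems from the load into RCX, not from the index calculation.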
- The garbage value from RCX is found at the following memory locations:
ffff8880036820d8: 8b08478908474e0f
ffff888106ddf108: 8b08478908474e0f
ffff88810d06eb50: 8b08478908474e0f
ffff88810d06ec88: 8b08478908474e0f
ffff88810d06ee10: 8b08478908474e0f
ffffc90004557b50: 8b08478908474e0f
ffffc90004557c88: 8b08478908474e0f
ffffc90004557e10: 8b08478908474e0f
ffffffffc049a108: 8b08478908474e0f
ffffffff836820d8: 8b08478908474e0f
crash> sym ffffffffc049a108
ffffffffc049a108 (T) drm_rect_intersect+0x28 [drm_kms_helper] /usr/src/debug/kernel-4.18.0-513.24.1.el8_9/linux-4.18.0-513.24.1.el8_9.x86_64/drivers/gpu/drm/drm_rect.c: 48
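- The memory at ffffffffc049a108 is module text, so the value that ended up in RCX corresponds to the bytes of CPU instructions rather than to a data pointer: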
0xffffffffc049a108 <+40>: 0f 4e 47 08 cmovle 0x8(%rdi),%eax
0xffffffffc049a10c <+44>: 89 47 08 mov %eax,0x8(%rdi)
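To check that the garbage value in RCX really is this instruction stream, the opcode bytes shown above can be read as a little-endian integer and compared with RCX. A minimal sketch: only the seven bytes visible in the two disassembled instructions are used, so the comparison covers the low 56 bits of the register.

```python
# Opcode bytes of the two instructions at drm_rect_intersect+0x28:
#   0f 4e 47 08   cmovle 0x8(%rdi),%eax
#   89 47 08      mov    %eax,0x8(%rdi)
code_bytes = bytes.fromhex("0f4e4708" "894708")

rcx = 0x8b08478908474e0f              # RCX from the register dump

# x86_64 is little-endian: the first byte in memory is the least significant.
low_bits_from_code = int.from_bytes(code_bytes, "little")
low_bits_from_rcx = rcx & ((1 << (8 * len(code_bytes))) - 1)

if __name__ == "__main__":
    print(f"instruction bytes as integer: {low_bits_from_code:#x}")
    print(f"low 56 bits of RCX:           {low_bits_from_rcx:#x}")
    print("match:", low_bits_from_code == low_bits_from_rcx)
```

The remaining top byte of RCX (0x8b) would be the first byte of the next instruction, which is consistent with the 8-byte value reported at ffffffffc049a108.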
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.