RHEL7.5: Linux NFS server kernel crashes in rx interrupt handler accessing svc_sock shortly after "nfsd: last server has exited, flushing export cache" seen in the log
Issue
Periodically the log reports that the system-wide open file limit has been reached: "VFS: file-max limit 19513468 reached". Some time after this message, and almost immediately after the message "nfsd: last server has exited, flushing export cache", the machine crashes in the Rx-side interrupt handler for the ixgbe driver. At the time of the crash it was handling a TCP receive and had called into svc_tcp_listen_data_ready.
crash> bt
PID: 0 TASK: ffff9ce73a284f10 CPU: 2 COMMAND: "swapper/2"
#0 [ffff9ce9cfc83730] machine_kexec at ffffffffb2e6178a
#1 [ffff9ce9cfc83790] __crash_kexec at ffffffffb2f13bf2
#2 [ffff9ce9cfc83860] crash_kexec at ffffffffb2f13ce0
#3 [ffff9ce9cfc83878] oops_end at ffffffffb3518728
#4 [ffff9ce9cfc838a0] die at ffffffffb2e2e96b
#5 [ffff9ce9cfc838d0] do_general_protection at ffffffffb35180de
#6 [ffff9ce9cfc83900] general_protection at ffffffffb35176f8
[exception RIP: __x86_indirect_thunk_rax+0xa]
RIP: ffffffffb315b00a RSP: ffff9ce9cfc839b0 RFLAGS: 00010286
RAX: 310000009edd0402 RBX: ffff9ce9cc287000 RCX: 00000001b6b4cb71
RDX: 00000001b6b4cb70 RSI: 0000000000000000 RDI: ffff9ce5c7878f80
RBP: ffff9ce9cfc839d0 R8: 000000000001bb00 R9: ffffffffb33d5457
R10: ffff9ce9cfc9bb00 R11: fffff41b40772080 R12: ffff9ce5c7878f80
R13: 0000000000000000 R14: ffff9ce5ddc82b00 R15: ffff9ce5c7879648
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff9ce9cfc839b0] svc_tcp_listen_data_ready at ffffffffc04137ab [sunrpc]
#8 [ffff9ce9cfc839d8] tcp_data_queue at ffffffffb3449ead
#9 [ffff9ce9cfc83a30] tcp_rcv_established at ffffffffb344d1e7
#10 [ffff9ce9cfc83a80] tcp_v4_do_rcv at ffffffffb3457e6a
#11 [ffff9ce9cfc83ac8] tcp_v4_rcv at ffffffffb34595f8
#12 [ffff9ce9cfc83b60] ip_local_deliver_finish at ffffffffb3432729
#13 [ffff9ce9cfc83b88] ip_local_deliver at ffffffffb3432a19
#14 [ffff9ce9cfc83be0] ip_rcv_finish at ffffffffb3432390
#15 [ffff9ce9cfc83c08] ip_rcv at ffffffffb3432d49
#16 [ffff9ce9cfc83c70] __netif_receive_skb_core at ffffffffb33ecab9
#17 [ffff9ce9cfc83ce0] __netif_receive_skb at ffffffffb33ecdc8
#18 [ffff9ce9cfc83d00] netif_receive_skb_internal at ffffffffb33ece50
#19 [ffff9ce9cfc83d30] napi_gro_receive at ffffffffb33eda78
#20 [ffff9ce9cfc83d58] ixgbe_clean_rx_irq at ffffffffc03a5360 [ixgbe]
#21 [ffff9ce9cfc83de0] ixgbe_poll at ffffffffc03a660e [ixgbe]
#22 [ffff9ce9cfc83e78] net_rx_action at ffffffffb33ed46f
#23 [ffff9ce9cfc83ef8] __do_softirq at ffffffffb2e9b085
#24 [ffff9ce9cfc83f68] call_softirq at ffffffffb3523cec
#25 [ffff9ce9cfc83f80] do_softirq at ffffffffb2e2d625
#26 [ffff9ce9cfc83fa0] irq_exit at ffffffffb2e9b405
#27 [ffff9ce9cfc83fb8] do_IRQ at ffffffffb3524f86
--- <IRQ stack> ---
#28 [ffff9ce73a29bdb8] ret_from_intr at ffffffffb3517362
[exception RIP: cpuidle_enter_state+0x54]
RIP: ffffffffb3369e84 RSP: ffff9ce73a29be60 RFLAGS: 00000202
RAX: 000ad9879bbe3afe RBX: ffff9ce73a29be40 RCX: 0000000000000018
RDX: 0000000225c17d03 RSI: 0000000000000002 RDI: 000ad9879bbe3afe
RBP: ffff9ce73a29be88 R8: 000000000000037a R9: 0000000000000018
R10: 00000000000003d3 R11: 7fffffffffffffff R12: 0000000000000002
R13: ffff9ce9cfc93a20 R14: ffffffffb2ebee35 R15: ffff9ce73a29bde0
ORIG_RAX: ffffffffffffff97 CS: 0010 SS: 0018
#29 [ffff9ce73a29be90] cpuidle_idle_call at ffffffffb3369fde
#30 [ffff9ce73a29bed0] arch_cpu_idle at ffffffffb2e356de
#31 [ffff9ce73a29bee0] cpu_startup_entry at ffffffffb2ef335a
#32 [ffff9ce73a29bf28] start_secondary at ffffffffb2e55f97
#33 [ffff9ce73a29bf50] start_cpu at ffffffffb2e000d5
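One way to read the exception frame above: the fault is a general protection fault raised inside __x86_indirect_thunk_rax, which is an indirect branch through the value in RAX. On x86_64, branching to a non-canonical address raises a general protection fault, so it can help to check whether RAX looks like a valid kernel address. The following is a minimal sketch of that check, not a diagnosis of the root cause. It assumes 48-bit virtual addressing (4-level paging); the helper name is_canonical is hypothetical, and the register values are copied from the backtrace.

```python
# Sketch: test whether a 64-bit value is a canonical x86_64 virtual address.
# Assumption: 48-bit virtual addressing (4-level paging).

VA_BITS = 48  # assumed width of the virtual address space


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all 0 or all 1."""
    top = addr >> (va_bits - 1)               # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


# Values copied from the exception frame in the backtrace.
RAX = 0x310000009EDD0402  # target of the indirect branch
RDI = 0xFFFF9CE5C7878F80  # a kernel pointer from the same frame, for comparison

if __name__ == "__main__":
    for name, value in (("RAX", RAX), ("RDI", RDI)):
        print(f"{name} {value:#018x} canonical={is_canonical(value)}")
```

Under these assumptions RAX is reported as non-canonical while RDI is canonical, which is consistent with the indirect call having been made through a corrupted or stale function pointer.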
Environment
- Red Hat Enterprise Linux 7.5 (NFS server)
- Seen on kernel 3.10.0-862.6.3.el7
