System got hung due to possible loss of communication with NFS server.
Issue
- The system hung, possibly because it lost communication with the NFS server.
- 5 tasks are in the UN (uninterruptible) state and 293 tasks are in the ZO (zombie) state:
crash> ps -S
RU: 6
UN: 5
IN: 1593
ZO: 293
- Almost all of the ZO state tasks are sshd processes:
crash> ps -m | grep ZO | less
[ 0 00:03:05.338] [ZO] PID: 5076 TASK: ffff8800414aedd0 CPU: 1 COMMAND: "sshd"
[ 0 00:04:12.527] [ZO] PID: 5071 TASK: ffff880041963ec0 CPU: 1 COMMAND: "sshd"
[ 0 00:09:12.986] [ZO] PID: 5057 TASK: ffff880041811f60 CPU: 2 COMMAND: "sshd"
[ 0 00:14:13.173] [ZO] PID: 5049 TASK: ffff88010fc2ce70 CPU: 0 COMMAND: "sshd"
[ 0 00:19:12.318] [ZO] PID: 5000 TASK: ffff88003c341f60 CPU: 1 COMMAND: "sshd"
[ 0 00:24:12.463] [ZO] PID: 4992 TASK: ffff8800414f0000 CPU: 1 COMMAND: "sshd"
[ 0 00:29:12.588] [ZO] PID: 4985 TASK: ffff880041592f10 CPU: 2 COMMAND: "sshd"
[ 0 00:34:12.803] [ZO] PID: 4970 TASK: ffff88003c340fb0 CPU: 0 COMMAND: "sshd"
[ 0 00:39:12.335] [ZO] PID: 4964 TASK: ffff8800414ade20 CPU: 2 COMMAND: "sshd"
[ 0 00:44:12.518] [ZO] PID: 4955 TASK: ffff8800414a8000 CPU: 1 COMMAND: "sshd"
[ 0 00:49:12.672] [ZO] PID: 4903 TASK: ffff88011146af10 CPU: 1 COMMAND: "sshd"
[ 0 00:54:12.783] [ZO] PID: 4891 TASK: ffff880040a72f10 CPU: 2 COMMAND: "sshd"
[ 0 00:59:12.943] [ZO] PID: 4884 TASK: ffff880040a9de20 CPU: 1 COMMAND: "sshd"
[ 0 01:04:13.065] [ZO] PID: 4874 TASK: ffff88003f9d3ec0 CPU: 2 COMMAND: "sshd"
[ 0 01:09:12.558] [ZO] PID: 4870 TASK: ffff88003f9d2f10 CPU: 0 COMMAND: "sshd"
[ 0 01:14:12.703] [ZO] PID: 4862 TASK: ffff880111c81f60 CPU: 0 COMMAND: "sshd"
[ 0 01:19:12.852] [ZO] PID: 4805 TASK: ffff88003f98af10 CPU: 1 COMMAND: "sshd"
[ 0 01:24:12.972] [ZO] PID: 4797 TASK: ffff88003f98bec0 CPU: 2 COMMAND: "sshd"
[ 0 01:29:13.143] [ZO] PID: 4791 TASK: ffff8801124b5e20 CPU: 0 COMMAND: "sshd"
[ 0 01:34:12.815] [ZO] PID: 4778 TASK: ffff88003c346dd0 CPU: 1 COMMAND: "sshd"
[ 0 01:39:12.405] [ZO] PID: 4770 TASK: ffff88011215af10 CPU: 1 COMMAND: "sshd"
[...]
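The "almost all sshd" observation can be checked without paging through the listing by hand. The following is a minimal sketch, assuming the `ps -m` output has been saved as text; the names `PS_M_LINE`, `count_commands` and `SAMPLE` are hypothetical helpers written for this illustration and are not part of crash.

```python
import re
from collections import Counter

# Matches one line of `ps -m` output, for example:
# [ 0 00:03:05.338] [ZO] PID: 5076 TASK: ffff8800414aedd0 CPU: 1 COMMAND: "sshd"
PS_M_LINE = re.compile(
    r'\[\s*\d+\s+\d+:\d+:\d+\.\d+\]\s+'      # [days hh:mm:ss.ms]
    r'\[(?P<state>[A-Z]{2})\]\s+'             # task state, e.g. ZO or UN
    r'PID:\s*(?P<pid>\d+)\s+TASK:\s*[0-9a-f]+\s+CPU:\s*\d+\s+'
    r'COMMAND:\s*"(?P<comm>[^"]*)"'
)


def count_commands(ps_m_output, state):
    """Count tasks per command name among tasks in the given state."""
    counts = Counter()
    for line in ps_m_output.splitlines():
        match = PS_M_LINE.search(line)
        if match and match.group("state") == state:
            counts[match.group("comm")] += 1
    return counts


# Three lines copied from the listing above; a real run would read the
# complete saved output instead (assumption: it was saved to a text file).
SAMPLE = '''
[ 0 00:03:05.338] [ZO] PID: 5076 TASK: ffff8800414aedd0 CPU: 1 COMMAND: "sshd"
[ 0 00:04:12.527] [ZO] PID: 5071 TASK: ffff880041963ec0 CPU: 1 COMMAND: "sshd"
[ 0 00:09:12.986] [ZO] PID: 5057 TASK: ffff880041811f60 CPU: 2 COMMAND: "sshd"
'''

if __name__ == "__main__":
    for comm, total in count_commands(SAMPLE, "ZO").most_common():
        print(f"{total:5d}  {comm}")
```

Passing "UN" instead of "ZO" gives the same per-command breakdown for the blocked tasks.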
- UN state tasks:
crash> ps -m | grep UN | tail
[ 1 01:33:20.032] [UN] PID: 1 TASK: ffff880139b20000 CPU: 3 COMMAND: "systemd"
[ 1 02:20:51.015] [UN] PID: 10636 TASK: ffff880137709f60 CPU: 3 COMMAND: "savscand"
[ 1 04:20:51.448] [UN] PID: 13159 TASK: ffff880028054e70 CPU: 2 COMMAND: "WL_ConfSys_pwcp"
[ 1 10:06:13.354] [UN] PID: 44 TASK: ffff88013902ce70 CPU: 1 COMMAND: "fsnotify_mark"
[ 1 10:06:13.335] [UN] PID: 13891 TASK: ffff880137e81f60 CPU: 2 COMMAND: "savscand"
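The bracketed value printed by `ps -m` is the time since the task last ran on a CPU, in days and hh:mm:ss.ms. To rank blocked tasks by how long they have been waiting, that value can be converted to seconds and sorted. This is a rough sketch under that assumption; `UN_LINE`, `rank_by_wait` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re

# Captures the elapsed-time fields, state, PID and command of a `ps -m` line.
UN_LINE = re.compile(
    r'\[\s*(\d+)\s+(\d+):(\d+):(\d+)\.(\d+)\]\s+\[(\w+)\]\s+'
    r'PID:\s*(\d+).*COMMAND:\s*"([^"]*)"'
)


def rank_by_wait(ps_m_output, state="UN"):
    """Return (seconds_since_last_run, pid, command), longest wait first."""
    ranked = []
    for line in ps_m_output.splitlines():
        match = UN_LINE.search(line)
        if not match or match.group(6) != state:
            continue
        days, hours, minutes, seconds = (int(match.group(i)) for i in range(1, 5))
        fraction = float("0." + match.group(5))  # fractional seconds
        total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds + fraction
        ranked.append((total, int(match.group(7)), match.group(8)))
    return sorted(ranked, reverse=True)


# The five UN lines from the listing above.
SAMPLE = '''
[ 1 01:33:20.032] [UN] PID: 1 TASK: ffff880139b20000 CPU: 3 COMMAND: "systemd"
[ 1 02:20:51.015] [UN] PID: 10636 TASK: ffff880137709f60 CPU: 3 COMMAND: "savscand"
[ 1 04:20:51.448] [UN] PID: 13159 TASK: ffff880028054e70 CPU: 2 COMMAND: "WL_ConfSys_pwcp"
[ 1 10:06:13.354] [UN] PID: 44 TASK: ffff88013902ce70 CPU: 1 COMMAND: "fsnotify_mark"
[ 1 10:06:13.335] [UN] PID: 13891 TASK: ffff880137e81f60 CPU: 2 COMMAND: "savscand"
'''

if __name__ == "__main__":
    for total, pid, comm in rank_by_wait(SAMPLE):
        print(f"{total:12.3f}s  PID {pid:<6d} {comm}")
```

In this listing, fsnotify_mark (PID 44) and savscand (PID 13891) last ran within a few milliseconds of each other, so either is a reasonable starting point for a backtrace.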
- Backtrace of one of the longest-blocked UN state tasks (PID 13891). It appears to be waiting for an NFSv4 RPC, issued from _nfs4_proc_open_confirm, to complete:
crash> bt 13891
PID: 13891 TASK: ffff880137e81f60 CPU: 2 COMMAND: "savscand"
#0 [ffff880089d63930] __schedule at ffffffff8168b225
#1 [ffff880089d63998] schedule at ffffffff8168b879
#2 [ffff880089d639a8] rpc_wait_bit_killable at ffffffffa02cee24 [sunrpc]
#3 [ffff880089d639c8] __wait_on_bit at ffffffff81689425
#4 [ffff880089d63a08] out_of_line_wait_on_bit at ffffffff816894d1
#5 [ffff880089d63a80] __rpc_wait_for_completion_task at ffffffffa02cedfd [sunrpc]
#6 [ffff880089d63a90] _nfs4_proc_open_confirm at ffffffffa065da08 [nfsv4]
#7 [ffff880089d63b18] _nfs4_open_and_get_state at ffffffffa0665e40 [nfsv4]
#8 [ffff880089d63bc0] nfs4_do_open at ffffffffa0666240 [nfsv4]
#9 [ffff880089d63c88] nfs4_atomic_open at ffffffffa0666747 [nfsv4]
#10 [ffff880089d63ce0] nfs4_file_open at ffffffffa067ae90 [nfsv4]
#11 [ffff880089d63d90] do_dentry_open at ffffffff811fbf07
#12 [ffff880089d63dd8] vfs_open at ffffffff811fc0df
#13 [ffff880089d63e00] dentry_open at ffffffff811fc1a9
#14 [ffff880089d63e38] fanotify_read at ffffffff8124652d
#15 [ffff880089d63f00] vfs_read at ffffffff811fe0ee
#16 [ffff880089d63f38] sys_read at ffffffff811fecbf
#17 [ffff880089d63f80] system_call_fastpath at ffffffff816967c9
RIP: 00007f9079eb122d RSP: 00007f90477fdd30 RFLAGS: 00000293
RAX: 0000000000000000 RBX: ffffffff816967c9 RCX: ffffffffffffffff
RDX: 000000000000002f RSI: 00007f90477fde00 RDI: 0000000000000042
RBP: 0000000058a85a69 R8: 0000000000000000 R9: 0000000000003643
R10: 0000000000000012 R11: 0000000000000293 R12: 0000000000003629
R13: 00007f9048000ef0 R14: 00007f9048000f58 R15: 00007f90477fde00
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
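When several blocked tasks need to be checked, the module tag at the end of each `bt` frame (for example [sunrpc] or [nfsv4]) is a quick indicator of whether a task is waiting inside the NFS client. The sketch below is illustrative only: `FRAME`, `parse_frames`, `nfs_related` and the set of module names treated as NFS-related are assumptions made for this example, not crash functionality.

```python
import re

# Matches a `bt` frame such as:
# #2 [ffff880089d639a8] rpc_wait_bit_killable at ffffffffa02cee24 [sunrpc]
FRAME = re.compile(
    r'#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+(?:\s+\[(\w+)\])?'
)

# Assumption: modules treated as belonging to the NFS client path.
NFS_MODULES = {"sunrpc", "nfs", "nfsv4"}


def parse_frames(bt_output):
    """Return (frame_number, symbol, module); core kernel frames get 'kernel'."""
    frames = []
    for line in bt_output.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append((int(match.group(1)), match.group(2),
                           match.group(3) or "kernel"))
    return frames


def nfs_related(frames):
    """Keep only the frames that belong to NFS_MODULES."""
    return [frame for frame in frames if frame[2] in NFS_MODULES]


# A few frames copied from the backtrace above.
SAMPLE = '''
#1 [ffff880089d63998] schedule at ffffffff8168b879
#2 [ffff880089d639a8] rpc_wait_bit_killable at ffffffffa02cee24 [sunrpc]
#5 [ffff880089d63a80] __rpc_wait_for_completion_task at ffffffffa02cedfd [sunrpc]
#6 [ffff880089d63a90] _nfs4_proc_open_confirm at ffffffffa065da08 [nfsv4]
#14 [ffff880089d63e38] fanotify_read at ffffffff8124652d
'''

if __name__ == "__main__":
    for number, symbol, module in nfs_related(parse_frames(SAMPLE)):
        print(f"#{number:<3d} {symbol} [{module}]")
```

A task whose innermost frames are in sunrpc with NFS frames directly beneath them, as in the backtrace above, is consistent with a task waiting on a reply from the NFS server.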
Environment
- Red Hat Enterprise Linux 7.3 (kernel-3.10.0-514.6.1.el7)
