System hangs during reboot due to in-progress NFS READ or WRITE operations, even though the NFS server is available
Environment
- Red Hat Enterprise Linux 6 (NFS client)
- seen on kernels 2.6.32-431.el6, 2.6.32-358.32.3.el6 and 2.6.32-504.el6.x86_64
- other kernels likely affected
- Red Hat Enterprise Linux 5 (NFS client)
- seen on 2.6.18-400.1.1.el5
- Any NFS server
- application issuing heavy IO to NFS while rebooting
- Problem does not occur on RHEL7, or Fedora20 + kernel-3.16.0-rc7
Issue
- The NFS client machine hangs while rebooting when I/O is in progress on an NFS mount.
- The same test on RHEL5 or RHEL7 does not hang during reboot.
- The 'reboot' or 'halt' command is hung in sync_filesystems.
- Server reboot/shutdown hangs with an 'NFS server not responding' message.
- Unmounting autofs hangs during system shutdown.
Resolution
- Update to initscripts-9.03.49-1.el6, which was made available via RHBA-2015:1380-2.
- Please note that this newer version of initscripts requires lvm2-2.02.100-5 or later due to a new parameter introduced in vgchange.
- WARNING This update caused a regression described in https://access.redhat.com/solutions/2116271, and is fixed in a later version of initscripts package. See the Resolution section of solution 2116271 for further details.
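The version requirements above can be checked with a short shell sketch. The helper names `version_at_least` and `check_package` are hypothetical, and the comparison uses GNU `sort -V`, which only approximates RPM's own version ordering, so treat the result as a hint rather than a definitive answer.

```shell
# Sketch: compare installed package versions against the minimums named above.
# version_at_least and check_package are hypothetical helper names.
# Assumption: GNU `sort -V` ordering is close enough to RPM ordering here.

version_at_least() {
    # usage: version_at_least INSTALLED MINIMUM
    # Succeeds when MINIMUM sorts first (or equal) in version order.
    val_installed=$1
    val_minimum=$2
    val_lowest=$(printf '%s\n%s\n' "$val_minimum" "$val_installed" | sort -V | head -n 1)
    [ "$val_lowest" = "$val_minimum" ]
}

check_package() {
    # usage: check_package NAME MINIMUM
    # Queries the RPM database for NAME and reports whether it meets MINIMUM.
    pkg_name=$1
    pkg_minimum=$2
    pkg_installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' "$pkg_name" | head -n 1) || {
        echo "$pkg_name: not installed"
        return 1
    }
    if version_at_least "$pkg_installed" "$pkg_minimum"; then
        echo "$pkg_name $pkg_installed: meets minimum $pkg_minimum"
    else
        echo "$pkg_name $pkg_installed: older than $pkg_minimum"
        return 1
    fi
}
```

For example, `check_package initscripts 9.03.49-1` and `check_package lvm2 2.02.100-5` cover the two versions mentioned in this section. A passing check says nothing about the regression described in solution 2116271.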
Workaround
- remove the script /etc/rc.d/rc0.d/K90network for shutdown
- remove the script /etc/rc.d/rc6.d/K90network for reboot
- NOTE that the workaround works only as long as the system does not have custom scripts in the /etc/rc.d/rc6.d and /etc/rc.d/rc0.d directories which deactivate the network.
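The workaround can be sketched in shell as follows. This version moves the K90network entries into a backup directory rather than deleting them, so that they can be put back later; that is a substitution for the plain removal described above, not part of the original workaround. The function name, the backup location, and the optional root-directory argument are all hypothetical.

```shell
# Sketch: take K90network out of the halt (rc0.d) and reboot (rc6.d)
# runlevel directories by moving it aside instead of deleting it.
# disable_network_kill_links is a hypothetical name; the first argument
# (default /) exists only so the function can be tried on a copy of the tree.

disable_network_kill_links() {
    dnk_root=${1:-/}
    dnk_backup=${2:-/root/K90network.backup}   # assumed backup location
    for dnk_level in 0 6; do
        dnk_link="${dnk_root%/}/etc/rc.d/rc${dnk_level}.d/K90network"
        if [ -e "$dnk_link" ] || [ -L "$dnk_link" ]; then
            # mv moves the symlink itself, not the script it points to
            mkdir -p "$dnk_backup/rc${dnk_level}.d" &&
            mv "$dnk_link" "$dnk_backup/rc${dnk_level}.d/" &&
            echo "moved $dnk_link"
        else
            echo "not present: $dnk_link"
        fi
    done
}
```

Run as root with no arguments, `disable_network_kill_links` acts on the live system. It does not look for custom scripts that deactivate the network, so the caveat in the NOTE above still applies.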
Root Cause
- In RHEL6, the 'reboot' command appears to hang waiting for 'sync' to complete, which never happens because the network is already down and the in-progress NFS I/O cannot finish.
- In RHEL7 and Fedora20 + kernel-3.16.0-rc7 this does not occur. In these configurations, even though the network is gone and NFS I/O is still in progress, the reboot command does not hang.
Diagnostic Steps
- Configure kdump and gather a vmcore at the time of the shutdown hang.
vmcore analysis
- When analysing the vmcore you will see that the flush is pending and the network is unavailable.
- Looking at the network devices, none have an IP address, indicating that the network has been torn down:
crash> net
NET_DEVICE NAME IP ADDRESS(ES)
ffff887fe7090020 lo
ffff883fe4e26020 eth0
ffff883fe48a0020 eth1
ffff883fe4f20020 eth2
ffff883fe5d21020 eth3
ffff88bfe6600020 eth4
ffff88bfe4d80020 eth5
ffff88bfe6a00020 eth6
ffff88bfe6a80020 eth7
ffff88bfe78f0020 bond0
- Look for a 'reboot' or 'halt' task hung in a backtrace waiting on sync_filesystems.
crash> ps | grep reboot
10479 1 25 ffff883fe2a53540 UN 0.0 14756 788 reboot
crash> bt 10479
PID: 10479 TASK: ffff883fe2a53540 CPU: 25 COMMAND: "reboot"
#0 [ffff883684c55bd8] schedule at ffffffff8150e7f2
#1 [ffff883684c55ca0] io_schedule at ffffffff8150efd3
#2 [ffff883684c55cc0] sync_page at ffffffff81119e6d
#3 [ffff883684c55cd0] __wait_on_bit at ffffffff8150f98f
#4 [ffff883684c55d20] wait_on_page_bit at ffffffff8111a0a3
#5 [ffff883684c55d80] wait_on_page_writeback_range at ffffffff8111a4cb
#6 [ffff883684c55e80] filemap_fdatawait at ffffffff8111a58f
#7 [ffff883684c55e90] sync_inodes_sb at ffffffff811ac214
#8 [ffff883684c55f20] __sync_filesystem at ffffffff811b2012
#9 [ffff883684c55f40] sync_filesystems at ffffffff811b2118
#10 [ffff883684c55f70] sys_sync at ffffffff811b21b1
#11 [ffff883684c55f80] system_call_fastpath at ffffffff8100b072
...
crash> ps | grep halt
23311 1 6 ffff880b7b6f8040 UN 0.0 14760 796 halt
crash> bt 23311
PID: 23311 TASK: ffff880b7b6f8040 CPU: 6 COMMAND: "halt"
#0 [ffff880b8ffafbd8] schedule at ffffffff81528bc0
#1 [ffff880b8ffafca0] io_schedule at ffffffff81529393
#2 [ffff880b8ffafcc0] sync_page at ffffffff8111f81d
#3 [ffff880b8ffafcd0] __wait_on_bit at ffffffff81529e5f
#4 [ffff880b8ffafd20] wait_on_page_bit at ffffffff8111fa53
#5 [ffff880b8ffafd80] wait_on_page_writeback_range at ffffffff8111fe7b
#6 [ffff880b8ffafe80] filemap_fdatawait at ffffffff8111ff3f
#7 [ffff880b8ffafe90] sync_inodes_sb at ffffffff811b5044
#8 [ffff880b8ffaff20] __sync_filesystem at ffffffff811bb6d2
#9 [ffff880b8ffaff40] sync_filesystems at ffffffff811bb7d8
#10 [ffff880b8ffaff70] sys_sync at ffffffff811bb871
- In addition, there are NFS 'not responding' messages in the log but no matching 'nfs: server ... OK' messages (or at least one 'OK' message is missing), indicating that at least one NFS operation had not received a response from the NFS server at the time of the vmcore.
crash> log | grep "nfs: server"
nfs: server nfsserver not responding, still trying
nfs: server nfsserver not responding, still trying
crash> log | grep 'nfs: server' | sort | uniq -c
2 nfs: server nfsserver not responding, still trying
crash>
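The comparison of 'not responding' and 'OK' messages can be automated with a small sketch like the one below. It assumes the kernel log text is supplied on standard input and that the messages have the form shown above; `nfs_pending_servers` is a hypothetical name. Comparing counts is only a heuristic for spotting a missing 'OK' message.

```shell
# Sketch: list NFS servers with more "not responding" than "OK" messages.
# nfs_pending_servers is a hypothetical name; input is kernel log text on stdin.
# Output columns: server name, "not responding" count, "OK" count.

nfs_pending_servers() {
    awk '
        /nfs: server .* not responding/ {
            # the server name is the field right after the word "server"
            for (i = 1; i < NF; i++) if ($i == "server") { down[$(i + 1)]++; break }
        }
        /nfs: server .* OK/ {
            for (i = 1; i < NF; i++) if ($i == "server") { up[$(i + 1)]++; break }
        }
        END {
            for (s in down) if (down[s] > up[s]) print s, down[s], up[s] + 0
        }
    '
}
```

Fed the two log lines shown above, it prints `nfsserver 2 0`. For instance, with the log saved to a file (the name kernel.log is an assumption), run `nfs_pending_servers < kernel.log`.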