NFSv4 server failover/reboot locks up VMs sensitive to I/O delays

Solution Verified

Environment

  • Red Hat Enterprise Linux (RHEL) 7
  • NFSv4 server

Issue

  • When a redundant NFSv4 node is rebooted for maintenance, Input/Output (I/O) operations block for 60 seconds (the default).

  • This delay is well above the I/O tolerance of many operating systems and leaves specific virtual machines (VMs) in an
    unrecoverable soft lockup state.

  • Rebooting the affected VMs is the only way to recover them, which is unacceptable.

  • This issue occurs with VMs that use the NFSv4 protocol and are sensitive to I/O delays or disruptions.

Resolution

  1. Work with the storage vendor to obtain the appropriate resolution steps.

  2. For servers running RHEL, refer to the KCS article "NFSv4 server restart causes long pause in NFS client".

Root Cause

  • Applications tolerate blocked I/O for only 20 seconds; if the failover takes longer, the applications fail.

Diagnostic Steps

  1. Create a 1G file on your NFS share:

    cd /var/lib/nova/mnt/50ebb5ce<LONG UUID> 9bee79
    dd if=/dev/zero of=ioping1.tmp bs=1024k count=1026
    
  2. Determine if the VM is using locks by running the command:

    lslocks
    
  3. Invoke ioping on the 1G file and capture measurements:

    # For VMs not using locks:
    ioping -Y -D -G -WWW -S 1g -s 10m -i 0.5 -k ioping1.tmp

    # For VMs using locks:
    flock -x -e ioping2.tmp ioping -Y -D -G -WWW -S 1g -s 10m -i 0.5 -k ioping2.tmp
    
  4. Run ioping over a stable network to the filer. A successful run produces a long stream of output.
    Note that a stable 10G network was used for this test.

    10 MiB >>> ioping1.tmp (nfs4.example.com:/mypath): request=1 time=17.0 ms (warmup)
    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=2 time=14.1 ms
    10 MiB >>> ioping1.tmp (nfs4.example.com:/mypath): request=3 time=16.1 ms
    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=4 time=39.0 ms
    
  5. A node reboot (or software update) results in the following measurements:

    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=2004 time=60.0 s (slow)
    

    and:

    [stack@tenlab2-director 03201885]$ grep -v ms ioping2.log
    10 MiB >>> ioping1.tmp (nfs4.example.com:/mypath): request=1957 time=59.5 s (slow)
    
  6. Breaking the process down into two steps keeps each pause short enough for VMs sensitive to I/O delays or disruptions
    to survive without a soft lockup:

    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=1172 time=6.08 s (slow)
    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=1574 time=12.4 s (slow)
    

    and:

    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=6106 time=2.14 s
    10 MiB >>> ioping1.tmp (nfs4.example.com:/mypath): request=6139 time=6.71 s
    10 MiB <<< ioping1.tmp (nfs4.example.com:/mypath): request=6302 time=13.4 s
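
The measurements above can be compared against the 20-second application tolerance described under Root Cause. The following is a minimal sketch, not part of ioping: slow_requests is a hypothetical helper that reads ioping output and prints every request whose latency exceeds a threshold given in seconds. It assumes the time= field is followed by one of the unit strings ns, us, ms, s, min, or hour, and it accepts output where request= and time= appear on the same line or on consecutive lines.

```shell
#!/bin/sh
# slow_requests THRESHOLD_SECONDS
# Hypothetical helper (not shipped with ioping): reads ioping output on
# stdin and prints each request whose latency exceeds THRESHOLD_SECONDS.
slow_requests() {
    awk -v limit="$1" '
    {
        for (i = 1; i <= NF; i++) {
            # Remember the most recent request number; it may be on the
            # same line as time= or on the line before it.
            if ($i ~ /^request=/)
                req = substr($i, 9)

            # time=<value> is followed by a separate unit token.
            if ($i ~ /^time=/ && i < NF) {
                value = substr($i, 6) + 0
                unit = $(i + 1)

                # Convert the latency to seconds (unit names are assumed).
                if (unit == "ns")        secs = value / 1000000000
                else if (unit == "us")   secs = value / 1000000
                else if (unit == "ms")   secs = value / 1000
                else if (unit == "s")    secs = value
                else if (unit == "min")  secs = value * 60
                else if (unit == "hour") secs = value * 3600
                else continue            # unknown unit: skip this field

                if (secs > limit + 0)
                    printf "request=%s time=%.2f s\n", req, secs
            }
        }
    }'
}
```

For example, `slow_requests 20 < ioping2.log` would list only the requests that stalled longer than 20 seconds. Applied to the samples above, the 59.5 s and 60.0 s requests from step 5 are reported, while none of the requests from step 6 are.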
    

