Boot hangs before systemd reaches network-online.target on an Azure instance when /etc/fstab contains an NFS entry

Solution In Progress - Updated -

Environment

  • An instance on Azure running Red Hat Enterprise Linux 8 or later
  • cloud-init.service is active
  • /etc/fstab contains an NFSv3 mount entry, or an NFSv4 mount entry that falls back to NFSv3

Issue

  • Instances with cloud-init enabled do not boot properly; they hang during boot.

Resolution

Workaround

There are two workaround patterns.

  • Pattern 1:
    Specify the NFSv4 mount option for the NFS entry in /etc/fstab

    1. Unmount the NFS file system.
    # umount <MOUNTPOINT>
    
    2. Append the NFSv4 option in /etc/fstab.
      Append the vers=4 or nfsvers=4 mount option to the targeted entry in /etc/fstab.

    3. Mount the NFS file system.

    # mount -a 
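
Before editing /etc/fstab for Pattern 1, it can help to list which NFS entries do not yet pin NFSv4. The following is a minimal sketch, not part of the original solution: the function name nfs_entries_without_v4 is hypothetical, and the sketch assumes that only entries of type "nfs" need checking and that vers=4, nfsvers=4, or a minor version such as vers=4.1 counts as pinned.

```shell
#!/bin/sh
# nfs_entries_without_v4 [FSTAB]
# Hypothetical helper: prints device, mount point, and options for every
# entry of type "nfs" whose options do not contain vers=4 / nfsvers=4
# (optionally with a minor version such as 4.1).
# FSTAB defaults to /etc/fstab; pass another path to check a copy.
nfs_entries_without_v4() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and incomplete lines
        $3 != "nfs" { next }                  # assumption: only type "nfs" is checked
        {
            opts = "," $4 ","                 # pad so every option is comma-delimited
            if (opts ~ /,(nfs)?vers=4(\.[0-9]+)?,/) next
            print $1, $2, $4
        }
    ' "${1:-/etc/fstab}"
}
```

Calling nfs_entries_without_v4 with no argument reads /etc/fstab; any line it prints is a candidate for the vers=4 or nfsvers=4 option described in step 2.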
    
  • Pattern 2:
    Append the noauto mount option to the NFSv3 entry in /etc/fstab and create a systemd drop-in for remote-fs.target

    1. Append the noauto mount option to the NFSv3 entry in /etc/fstab.

    2. Confirm the unit file name (<UNIT_NAME>) corresponding to the mount point.
      <UNIT_NAME> has the form xxx.mount.

    # systemd-escape -p --suffix=mount <MOUNTPOINT>
    <UNIT_NAME>
    
    3. Create the directory /etc/systemd/system/remote-fs.target.d/ if it does not exist.
    # mkdir -p /etc/systemd/system/remote-fs.target.d/
    
    4. Create /etc/systemd/system/remote-fs.target.d/50-Requires.conf with the following contents.
    (<UNIT_NAME> is the name (xxx.mount) from step 2)
    
    [Unit]
    Requires=<UNIT_NAME>
    
    5. Execute systemctl daemon-reload to apply the configuration to systemd.
    # systemctl daemon-reload
    
    6. Verify that the settings are applied.
    # systemctl show -p DropInPaths,Requires remote-fs.target
     Requires=<UNIT_NAME>
     DropInPaths=/etc/systemd/system/remote-fs.target.d/50-Requires.conf
    

    If Requires and DropInPaths appear in the output above, they are set correctly.

    7. Reboot the instance and then verify that the NFSv3 entry is mounted automatically.
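
The directory and drop-in creation steps of Pattern 2 can be combined into one small helper. This is a minimal sketch under stated assumptions: the function name write_requires_dropin and the example mount point /mnt/nfs are hypothetical, and the optional second argument exists only so the helper can be tried against a scratch directory before touching /etc/systemd/system.

```shell
#!/bin/sh
# write_requires_dropin UNIT_NAME [DROPIN_DIR]
# Hypothetical helper: writes 50-Requires.conf so that remote-fs.target
# requires UNIT_NAME. DROPIN_DIR defaults to the directory used in the
# workaround steps.
write_requires_dropin() {
    unit_name=$1
    dropin_dir=${2:-/etc/systemd/system/remote-fs.target.d}
    if [ -z "$unit_name" ]; then
        echo "usage: write_requires_dropin UNIT_NAME [DROPIN_DIR]" >&2
        return 1
    fi
    mkdir -p "$dropin_dir" || return 1
    # %s keeps escape sequences produced by systemd-escape (e.g. \x2d) literal.
    printf '[Unit]\nRequires=%s\n' "$unit_name" > "$dropin_dir/50-Requires.conf"
}

# Example on a real system (run as root), for a hypothetical mount point /mnt/nfs:
#   write_requires_dropin "$(systemd-escape -p --suffix=mount /mnt/nfs)"
#   systemctl daemon-reload
```

The helper only writes the file; systemctl daemon-reload and the verification with systemctl show are still needed afterwards.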

Root Cause

  • If cloud-init is enabled in a VM on Azure, rpc-statd-notify.service and rpc-statd.service start before the network is ready.
    As a result, rpc-statd-notify.service and rpc-statd.service get stuck while NFSv3 is being set up, and the system boot stops.
  • NFSv3 requires rpc-statd-notify.service and rpc-statd.service, so the issue occurs with NFSv3.
  • NFSv4 does not require them. However, if an NFSv4 mount is configured to fall back to NFSv3, the issue may occur with NFSv4 as well.

Diagnostic Steps

  1. Check /usr/lib/systemd/system/cloud-init.service on the instance to confirm that Before=network-online.target is set.

    [Unit]
    Description=Initial cloud-init job (metadata service crawler)
    ..snip..
    Before=network-online.target
    ..snip..
    
  2. Check /etc/cloud/cloud.cfg on the instance to confirm that cloud-init.service executes the cloud-init modules.
    On an Azure VM, the cloud-init mounts module runs via cloud-init.service, according to the requirements in Microsoft's documentation.

    ..snip..
    cloud_init_modules:
     - disk_setup
     - mounts
    ..snip..
    
  3. Check /usr/lib/systemd/system/rpc-statd.service on the instance to confirm that After=network-online.target is set.

    [Unit]
    Description=NFS status monitor for NFSv2/3 locking.
    DefaultDependencies=no
    Conflicts=umount.target
    Requires=nss-lookup.target rpcbind.socket
    Wants=network-online.target
    After=network-online.target nss-lookup.target rpcbind.socket
    ..snip..
    
  4. Check /etc/fstab on the instance to confirm that neither an NFSv4 option nor the noauto mount option is specified for the NFS entry.
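
The unit-file checks in steps 1 and 3 can be scripted. The sketch below is an illustration only: the function name unit_orders is hypothetical, and it reads a single unit file as-is, so it does not see drop-ins or overrides under /etc/systemd/system.

```shell
#!/bin/sh
# unit_orders FILE DIRECTIVE UNIT
# Hypothetical helper: succeeds (exit status 0) if FILE contains a line
# "DIRECTIVE=..." whose whitespace-separated value list includes UNIT.
unit_orders() {
    awk -v key="$2" -v unit="$3" '
        index($0, key "=") == 1 {
            # Split everything after "DIRECTIVE=" into individual unit names.
            n = split(substr($0, length(key) + 2), parts, /[[:space:]]+/)
            for (i = 1; i <= n; i++)
                if (parts[i] == unit) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example on the instance:
#   unit_orders /usr/lib/systemd/system/cloud-init.service Before network-online.target \
#     && echo "cloud-init.service is ordered before network-online.target"
#   unit_orders /usr/lib/systemd/system/rpc-statd.service After network-online.target \
#     && echo "rpc-statd.service is ordered after network-online.target"
```

If both checks succeed and /etc/fstab has an NFS entry without an NFSv4 option or noauto, the instance matches the conditions described in this solution.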

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
