Red Hat Training

A Red Hat training course is available for RHEL 8

Chapter 6. Configuring an active/passive NFS server in a Red Hat High Availability cluster

The Red Hat High Availability Add-On provides support for running a highly available active/passive NFS server on a Red Hat Enterprise Linux High Availability Add-On cluster using shared storage. In the following example, you are configuring a two-node cluster in which clients access the NFS file system through a floating IP address. The NFS server runs on one of the two nodes in the cluster. If the node on which the NFS server is running becomes inoperative, the NFS server starts up again on the second node of the cluster with minimal service interruption.

This use case requires that your system include the following components:

  • A two-node Red Hat High Availability cluster with power fencing configured for each node. We recommend but do not require a private network. This procedure uses the cluster example provided in Creating a Red Hat High-Availability cluster with Pacemaker.
  • A public virtual IP address, required for the NFS server.
  • Shared storage for the nodes in the cluster, using iSCSI, Fibre Channel, or other shared network block device.

Configuring a highly available active/passive NFS server on an existing two-node Red Hat Enterprise Linux High Availability cluster requires that you perform the following steps:

  1. Configure a file system on an LVM logical volume on the shared storage for the nodes in the cluster.
  2. Configure an NFS share on the shared storage on the LVM logical volume.
  3. Create the cluster resources.
  4. Test the NFS server you have configured.

6.1. Configuring an LVM volume with an XFS file system in a Pacemaker cluster

Create an LVM logical volume on storage that is shared between the nodes of the cluster with the following procedure.

Note

LVM volumes and the corresponding partitions and devices used by cluster nodes must be connected to the cluster nodes only.

The following procedure creates an LVM logical volume and then creates an XFS file system on that volume for use in a Pacemaker cluster. In this example, the shared partition /dev/sdb1 is used to store the LVM physical volume from which the LVM logical volume will be created.

Procedure

  1. On both nodes of the cluster, perform the following steps to set the value for the LVM system ID to the value of the uname identifier for the system. The LVM system ID will be used to ensure that only the cluster is capable of activating the volume group.

    1. Set the system_id_source configuration option in the /etc/lvm/lvm.conf configuration file to uname.

      # Configuration option global/system_id_source.
      system_id_source = "uname"
    2. Verify that the LVM system ID on the node matches the uname for the node.

      # lvm systemid
        system ID: z1.example.com
      # uname -n
        z1.example.com
  2. Create the LVM volume and create an XFS file system on that volume. Since the /dev/sdb1 partition is storage that is shared, you perform this part of the procedure on one node only.

    Note

    If your LVM volume group contains one or more physical volumes that reside on remote block storage, such as an iSCSI target, Red Hat recommends that you ensure that the service starts before Pacemaker starts. For information about configuring startup order for a remote physical volume used by a Pacemaker cluster, see Configuring startup order for resource dependencies not managed by Pacemaker.

    1. Create an LVM physical volume on partition /dev/sdb1.

      [root@z1 ~]# pvcreate /dev/sdb1
        Physical volume "/dev/sdb1" successfully created
      Note

      If your LVM volume group contains one or more physical volumes that reside on remote block storage, such as an iSCSI target, Red Hat recommends that you ensure that the service starts before Pacemaker starts. For information about configuring startup order for a remote physical volume used by a Pacemaker cluster, see Configuring startup order for resource dependencies not managed by Pacemaker.

    2. Create the volume group my_vg that consists of the physical volume /dev/sdb1.

      For RHEL 8.5 and later, specify the --setautoactivation n flag to ensure that volume groups managed by Pacemaker in a cluster will not be automatically activated on startup. If you are using an existing volume group for the LVM volume you are creating, you can reset this flag with the vgchange --setautoactivation n command for the volume group.

      [root@z1 ~]# vgcreate --setautoactivation n my_vg /dev/sdb1
        Volume group "my_vg" successfully created

      For RHEL 8.4 and earlier, create the volume group with the following command.

      [root@z1 ~]# vgcreate my_vg /dev/sdb1
        Volume group "my_vg" successfully created

      For information about ensuring that volume groups managed by Pacemaker in a cluster will not be automatically activated on startup for RHEL 8.4 and earlier, see Ensuring a volume group is not activated on multiple cluster nodes.

    3. Verify that the new volume group has the system ID of the node on which you are running and from which you created the volume group.

      [root@z1 ~]# vgs -o+systemid
        VG    #PV #LV #SN Attr   VSize  VFree  System ID
        my_vg   1   0   0 wz--n- <1.82t <1.82t z1.example.com
    4. Create a logical volume using the volume group my_vg.

      [root@z1 ~]# lvcreate -L450 -n my_lv my_vg
        Rounding up size to full physical extent 452.00 MiB
        Logical volume "my_lv" created

      You can use the lvs command to display the logical volume.

      [root@z1 ~]# lvs
        LV      VG      Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
        my_lv   my_vg   -wi-a---- 452.00m
        ...
    5. Create an XFS file system on the logical volume my_lv.

      [root@z1 ~]# mkfs.xfs /dev/my_vg/my_lv
      meta-data=/dev/my_vg/my_lv       isize=512    agcount=4, agsize=28928 blks
               =                       sectsz=512   attr=2, projid32bit=1
      ...
  3. (RHEL 8.5 and later) If you have enabled the use of a devices file by setting use_devicesfile = 1 in the lvm.conf file, add the shared device to the devices file on the second node in the cluster. By default, the use of a devices file is not enabled.

    [root@z2 ~]# lvmdevices --adddev /dev/sdb1

6.2. Ensuring a volume group is not activated on multiple cluster nodes (RHEL 8.4 and earlier)

You can ensure that volume groups that are managed by Pacemaker in a cluster will not be automatically activated on startup with the following procedure. If a volume group is automatically activated on startup rather than by Pacemaker, there is a risk that the volume group will be active on multiple nodes at the same time, which could corrupt the volume group’s metadata.

Note

For RHEL 8.5 and later, you can disable autoactivation for a volume group when you create the volume group by specifying the --setautoactivation n flag for the vgcreate command, as described in Configuring an LVM volume with an XFS file system in a Pacemaker cluster.

This procedure modifies the auto_activation_volume_list entry in the /etc/lvm/lvm.conf configuration file. The auto_activation_volume_list entry is used to limit autoactivation to specific logical volumes. Setting auto_activation_volume_list to an empty list disables autoactivation entirely.

Any local volumes that are not shared and are not managed by Pacemaker should be included in the auto_activation_volume_list entry, including volume groups related to the node’s local root and home directories. All volume groups managed by the cluster manager must be excluded from the auto_activation_volume_list entry.

Procedure

Perform the following procedure on each node in the cluster.

  1. Determine which volume groups are currently configured on your local storage with the following command. This will output a list of the currently-configured volume groups. If you have space allocated in separate volume groups for root and for your home directory on this node, you will see those volumes in the output, as in this example.

    # vgs --noheadings -o vg_name
      my_vg
      rhel_home
      rhel_root
  2. Add the volume groups other than my_vg (the volume group you have just defined for the cluster) as entries to auto_activation_volume_list in the /etc/lvm/lvm.conf configuration file.

    For example, if you have space allocated in separate volume groups for root and for your home directory, you would uncomment the auto_activation_volume_list line of the lvm.conf file and add these volume groups as entries to auto_activation_volume_list as follows. Note that the volume group you have just defined for the cluster (my_vg in this example) is not in this list.

    auto_activation_volume_list = [ "rhel_root", "rhel_home" ]
    Note

    If no local volume groups are present on a node to be activated outside of the cluster manager, you must still initialize the auto_activation_volume_list entry as auto_activation_volume_list = [].

  3. Rebuild the initramfs boot image to guarantee that the boot image will not try to activate a volume group controlled by the cluster. Update the initramfs device with the following command. This command may take up to a minute to complete.

    # dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
  4. Reboot the node.

    Note

    If you have installed a new Linux kernel since booting the node on which you created the boot image, the new initrd image will be for the kernel that was running when you created it and not for the new kernel that is running when you reboot the node. You can ensure that the correct initrd device is in use by running the uname -r command before and after the reboot to determine the kernel release that is running. If the releases are not the same, update the initrd file after rebooting with the new kernel and then reboot the node.

  5. When the node has rebooted, check whether the cluster services have started up again on that node by executing the pcs cluster status command on that node. If this yields the message Error: cluster is not currently running on this node then enter the following command.

    # pcs cluster start

    Alternately, you can wait until you have rebooted each node in the cluster and start cluster services on all of the nodes in the cluster with the following command.

    # pcs cluster start --all

6.3. Configuring an NFS share

Configure an NFS share for an NFS service failover with the following procedure.

Procedure

  1. On both nodes in the cluster, create the /nfsshare directory.

    # mkdir /nfsshare
  2. On one node in the cluster, perform the following procedure.

    1. Ensure that the logical volume you you created in Configuring an LVM volume with an XFS file system is activated, then mount the file system you created on the logical volume on the /nfsshare directory.

      [root@z1 ~]# lvchange -ay my_vg/my_lv
      [root@z1 ~]# mount /dev/my_vg/my_lv /nfsshare
    2. Create an exports directory tree on the /nfsshare directory.

      [root@z1 ~]# mkdir -p /nfsshare/exports
      [root@z1 ~]# mkdir -p /nfsshare/exports/export1
      [root@z1 ~]# mkdir -p /nfsshare/exports/export2
    3. Place files in the exports directory for the NFS clients to access. For this example, we are creating test files named clientdatafile1 and clientdatafile2.

      [root@z1 ~]# touch /nfsshare/exports/export1/clientdatafile1
      [root@z1 ~]# touch /nfsshare/exports/export2/clientdatafile2
    4. Unmount the file system and deactivate the LVM volume group.

      [root@z1 ~]# umount /dev/my_vg/my_lv
      [root@z1 ~]# vgchange -an my_vg

6.4. Configuring the resources and resource group for an NFS server in a cluster

Configure the cluster resources for an NFS server in a cluster with the following procedure.

Note

If you have not configured a fencing device for your cluster, by default the resources do not start.

If you find that the resources you configured are not running, you can run the pcs resource debug-start resource command to test the resource configuration. This starts the service outside of the cluster’s control and knowledge. At the point the configured resources are running again, run pcs resource cleanup resource to make the cluster aware of the updates.

Procedure

The following procedure configures the system resources. To ensure these resources all run on the same node, they are configured as part of the resource group nfsgroup. The resources will start in the order in which you add them to the group, and they will stop in the reverse order in which they are added to the group. Run this procedure from one node of the cluster only.

  1. Create the LVM-activate resource named my_lvm. Because the resource group nfsgroup does not yet exist, this command creates the resource group.

    Warning

    Do not configure more than one LVM-activate resource that uses the same LVM volume group in an active/passive HA configuration, as this risks data corruption. Additionally, do not configure an LVM-activate resource as a clone resource in an active/passive HA configuration.

    [root@z1 ~]# pcs resource create my_lvm ocf:heartbeat:LVM-activate vgname=my_vg vg_access_mode=system_id --group nfsgroup
  2. Check the status of the cluster to verify that the resource is running.

    root@z1 ~]#  pcs status
    Cluster name: my_cluster
    Last updated: Thu Jan  8 11:13:17 2015
    Last change: Thu Jan  8 11:13:08 2015
    Stack: corosync
    Current DC: z2.example.com (2) - partition with quorum
    Version: 1.1.12-a14efad
    2 Nodes configured
    3 Resources configured
    
    Online: [ z1.example.com z2.example.com ]
    
    Full list of resources:
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: nfsgroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z1.example.com
    
    PCSD Status:
      z1.example.com: Online
      z2.example.com: Online
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  3. Configure a Filesystem resource for the cluster.

    The following command configures an XFS Filesystem resource named nfsshare as part of the nfsgroup resource group. This file system uses the LVM volume group and XFS file system you created in Configuring an LVM volume with an XFS file system and will be mounted on the /nfsshare directory you created in Configuring an NFS share.

    [root@z1 ~]# pcs resource create nfsshare Filesystem device=/dev/my_vg/my_lv directory=/nfsshare fstype=xfs --group nfsgroup

    You can specify mount options as part of the resource configuration for a Filesystem resource with the options=options parameter. Run the pcs resource describe Filesystem command for full configuration options.

  4. Verify that the my_lvm and nfsshare resources are running.

    [root@z1 ~]# pcs status
    ...
    Full list of resources:
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: nfsgroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z1.example.com
         nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
    ...
  5. Create the nfsserver resource named nfs-daemon as part of the resource group nfsgroup.

    Note

    The nfsserver resource allows you to specify an nfs_shared_infodir parameter, which is a directory that NFS servers use to store NFS-related stateful information.

    It is recommended that this attribute be set to a subdirectory of one of the Filesystem resources you created in this collection of exports. This ensures that the NFS servers are storing their stateful information on a device that will become available to another node if this resource group needs to relocate. In this example;

    • /nfsshare is the shared-storage directory managed by the Filesystem resource
    • /nfsshare/exports/export1 and /nfsshare/exports/export2 are the export directories
    • /nfsshare/nfsinfo is the shared-information directory for the nfsserver resource
    [root@z1 ~]# pcs resource create nfs-daemon nfsserver nfs_shared_infodir=/nfsshare/nfsinfo nfs_no_notify=true --group nfsgroup
    
    [root@z1 ~]# pcs status
    ...
  6. Add the exportfs resources to export the /nfsshare/exports directory. These resources are part of the resource group nfsgroup. This builds a virtual directory for NFSv4 clients. NFSv3 clients can access these exports as well.

    Note

    The fsid=0 option is required only if you want to create a virtual directory for NFSv4 clients. For more information, see How do I configure the fsid option in an NFS server’s /etc/exports file?. //

    [root@z1 ~]# pcs resource create nfs-root exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports fsid=0 --group nfsgroup
    
    [root@z1 ~]# pcs resource create nfs-export1 exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export1 fsid=1 --group nfsgroup
    
    [root@z1 ~]# pcs resource create nfs-export2 exportfs clientspec=192.168.122.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export2 fsid=2 --group nfsgroup
  7. Add the floating IP address resource that NFS clients will use to access the NFS share. This resource is part of the resource group nfsgroup. For this example deployment, we are using 192.168.122.200 as the floating IP address.

    [root@z1 ~]# pcs resource create nfs_ip IPaddr2 ip=192.168.122.200 cidr_netmask=24 --group nfsgroup
  8. Add an nfsnotify resource for sending NFSv3 reboot notifications once the entire NFS deployment has initialized. This resource is part of the resource group nfsgroup.

    Note

    For the NFS notification to be processed correctly, the floating IP address must have a host name associated with it that is consistent on both the NFS servers and the NFS client.

    [root@z1 ~]# pcs resource create nfs-notify nfsnotify source_host=192.168.122.200 --group nfsgroup
  9. After creating the resources and the resource constraints, you can check the status of the cluster. Note that all resources are running on the same node.

    [root@z1 ~]# pcs status
    ...
    Full list of resources:
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: nfsgroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z1.example.com
         nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
         nfs-daemon (ocf::heartbeat:nfsserver):     Started z1.example.com
         nfs-root   (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs-export1        (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs-export2        (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs_ip     (ocf::heartbeat:IPaddr2):       Started  z1.example.com
         nfs-notify (ocf::heartbeat:nfsnotify):     Started z1.example.com
    ...

6.5. Testing the NFS resource configuration

You can validate your NFS resource configuration in a high availability cluster with the following procedures. You should be able to mount the exported file system with either NFSv3 or NFSv4.

6.5.1. Testing the NFS export

  1. If you are running the firewalld daemon on your cluster nodes, ensure that the ports that your system requires for NFS access are enabled on all nodes.
  2. On a node outside of the cluster, residing in the same network as the deployment, verify that the NFS share can be seen by mounting the NFS share. For this example, we are using the 192.168.122.0/24 network.

    # showmount -e 192.168.122.200
    Export list for 192.168.122.200:
    /nfsshare/exports/export1 192.168.122.0/255.255.255.0
    /nfsshare/exports         192.168.122.0/255.255.255.0
    /nfsshare/exports/export2 192.168.122.0/255.255.255.0
  3. To verify that you can mount the NFS share with NFSv4, mount the NFS share to a directory on the client node. After mounting, verify that the contents of the export directories are visible. Unmount the share after testing.

    # mkdir nfsshare
    # mount -o "vers=4" 192.168.122.200:export1 nfsshare
    # ls nfsshare
    clientdatafile1
    # umount nfsshare
  4. Verify that you can mount the NFS share with NFSv3. After mounting, verify that the test file clientdatafile1 is visible. Unlike NFSv4, since NFSv3 does not use the virtual file system, you must mount a specific export. Unmount the share after testing.

    # mkdir nfsshare
    # mount -o "vers=3" 192.168.122.200:/nfsshare/exports/export2 nfsshare
    # ls nfsshare
    clientdatafile2
    # umount nfsshare

6.5.2. Testing for failover

  1. On a node outside of the cluster, mount the NFS share and verify access to the clientdatafile1 file you created in Configuring an NFS share.

    # mkdir nfsshare
    # mount -o "vers=4" 192.168.122.200:export1 nfsshare
    # ls nfsshare
    clientdatafile1
  2. From a node within the cluster, determine which node in the cluster is running nfsgroup. In this example, nfsgroup is running on z1.example.com.

    [root@z1 ~]# pcs status
    ...
    Full list of resources:
     myapc  (stonith:fence_apc_snmp):       Started z1.example.com
     Resource Group: nfsgroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z1.example.com
         nfsshare   (ocf::heartbeat:Filesystem):    Started z1.example.com
         nfs-daemon (ocf::heartbeat:nfsserver):     Started z1.example.com
         nfs-root   (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs-export1        (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs-export2        (ocf::heartbeat:exportfs):      Started z1.example.com
         nfs_ip     (ocf::heartbeat:IPaddr2):       Started  z1.example.com
         nfs-notify (ocf::heartbeat:nfsnotify):     Started z1.example.com
    ...
  3. From a node within the cluster, put the node that is running nfsgroup in standby mode.

    [root@z1 ~]# pcs node standby z1.example.com
  4. Verify that nfsgroup successfully starts on the other cluster node.

    [root@z1 ~]# pcs status
    ...
    Full list of resources:
     Resource Group: nfsgroup
         my_lvm     (ocf::heartbeat:LVM-activate):   Started z2.example.com
         nfsshare   (ocf::heartbeat:Filesystem):    Started z2.example.com
         nfs-daemon (ocf::heartbeat:nfsserver):     Started z2.example.com
         nfs-root   (ocf::heartbeat:exportfs):      Started z2.example.com
         nfs-export1        (ocf::heartbeat:exportfs):      Started z2.example.com
         nfs-export2        (ocf::heartbeat:exportfs):      Started z2.example.com
         nfs_ip     (ocf::heartbeat:IPaddr2):       Started  z2.example.com
         nfs-notify (ocf::heartbeat:nfsnotify):     Started z2.example.com
    ...
  5. From the node outside the cluster on which you have mounted the NFS share, verify that this outside node still continues to have access to the test file within the NFS mount.

    # ls nfsshare
    clientdatafile1

    Service will be lost briefly for the client during the failover but the client should recover it with no user intervention. By default, clients using NFSv4 may take up to 90 seconds to recover the mount; this 90 seconds represents the NFSv4 file lease grace period observed by the server on startup. NFSv3 clients should recover access to the mount in a matter of a few seconds.

  6. From a node within the cluster, remove the node that was initially running nfsgroup from standby mode.

    Note

    Removing a node from standby mode does not in itself cause the resources to fail back over to that node. This will depend on the resource-stickiness value for the resources. For information about the resource-stickiness meta attribute, see Configuring a resource to prefer its current node.

    [root@z1 ~]# pcs node unstandby z1.example.com