"ERROR: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist" in RHEL7 cluster.

Solution Verified

Environment

  • Red Hat Enterprise Linux Server 7 (with the High Availability and Resilient Storage Add-Ons)
  • Pacemaker
  • Cluster nodes running as ESX guests. (May also apply to cluster nodes running on physical hosts.)

Issue

  • One of the cluster nodes, running as a guest of an ESX host, was unable to see the shared storage assigned to the OS, and the cluster Filesystem resource agent reported the following messages:
Aug 22 16:41:57 node02 Filesystem(clusterfs)[21353]: ERROR: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist
Aug 22 16:43:25 node02 Filesystem(clusterfs)[21633]: ERROR: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist
Aug 22 16:51:50 node02 Filesystem(clusterfs)[2789]: ERROR: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist

Aug 25 14:43:38 node02 Filesystem(clusterfs)[2761]: WARNING: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist
Aug 25 14:43:38 node02 crmd[1907]: notice: process_lrm_event: LRM operation clvmd_monitor_30000 (call=32, rc=0, cib-update=23, confirmed=false) ok
Aug 25 14:43:38 node02 Filesystem(clusterfs)[2761]: INFO: Running stop for /dev/my_vg/my_lv on /mnt/my_fs
Aug 25 14:43:38 node02 lrmd[1904]: notice: operation_finished: clusterfs_stop_0:2761:stderr [ blockdev: cannot open /dev/my_vg/my_lv: No such file or directory ]
Aug 25 14:43:38 node02 crmd[1907]: notice: process_lrm_event: LRM operation clusterfs_stop_0 (call=33, rc=0, cib-update=24, confirmed=true) ok
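Before changing the VMware configuration, it can help to confirm which node actually sees the device. The following is a diagnostic sketch; the device and VG/LV names are taken from the log messages above and should be adjusted to your environment:

```shell
# Run on each cluster node. /dev/my_vg/my_lv is the device from the
# log messages above; substitute your own VG/LV names.

# Ask the kernel whether the block device node exists at all
ls -l /dev/my_vg/my_lv /dev/mapper/my_vg-my_lv

# Rescan LVM metadata; the LV should appear on every node
pvscan
lvs -o lv_name,vg_name,lv_attr,devices my_vg

# Compare the block devices each node sees; on a correctly shared
# setup the same LUN/VMDK should show up on both nodes
lsblk -o NAME,SIZE,TYPE,WWN
```

If the LV is visible on one node but absent on the other (as in the logs above), the problem is below the cluster stack, in how the storage is presented to the guests.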

Resolution

  • The dm device was available on one of the cluster nodes, while the other node reported the messages above. In other words, the shared storage devices were not shared correctly between the cluster nodes. For cluster nodes running as VMware guests, follow these steps:

1. Set the VMware SCSI controllers to "Physical" SCSI bus sharing so that the VMDK file is shared between the cluster nodes.
2. Convert the VMware disks to the "eagerzeroedthick" type. Make these changes before starting the cluster nodes. The easiest way is to turn on VM Fault Tolerance for the guest and then turn it back off.
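The disk conversion in step 2 can also be done directly with `vmkfstools` on the ESXi host. This is a sketch only; the datastore path below is a placeholder, and the commands run on the ESXi host, not inside the guest:

```shell
# Run on the ESXi host. The VMDK path is a placeholder -- substitute
# the actual path of the shared disk, and power off the VMs first.

# Inflate an existing thin-provisioned disk to eagerzeroedthick in place
vmkfstools --inflatedisk /vmfs/volumes/datastore1/shared/shared_disk.vmdk

# Alternatively, clone the disk into a new eagerzeroedthick VMDK
vmkfstools -i /vmfs/volumes/datastore1/shared/shared_disk.vmdk \
           -d eagerzeroedthick \
           /vmfs/volumes/datastore1/shared/shared_disk_ezt.vmdk
```

After cloning, reattach the new VMDK to both guests; after inflating, no reattachment is needed.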

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

1 Comment

Two more things you should check:

  1. Make sure that the "mycluster" name tag below:
    mkfs.gfs2 -p lock_dlm -t mycluster:gfs2 -j 2 /dev/mapper/clusterVG-clusterLV
    matches the cluster name in cluster.conf or cib.xml.

  2. Make sure you have correctly set up ordering and colocation constraints, e.g.:
    pcs constraint order start dlm-clone then clvmd-clone
    pcs constraint colocation add clvmd-clone with dlm-clone
    pcs constraint order start clvmd-clone then fs_gfs2-clone
    pcs constraint colocation add fs_gfs2-clone with clvmd-clone
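Both points from the comment can be checked from a running node. This is a sketch assuming RHEL 7 with corosync 2; the LV path is the one from the mkfs.gfs2 example above:

```shell
# 1. The cluster name corosync is actually using
grep cluster_name /etc/corosync/corosync.conf
corosync-cmapctl -g totem.cluster_name

# The lock table stamped into the GFS2 superblock; the part before
# the colon must match the cluster name printed above
tunegfs2 -l /dev/mapper/clusterVG-clusterLV

# 2. Review the configured ordering and colocation constraints
pcs constraint show
```

If the lock-table prefix and the corosync cluster name differ, the filesystem will refuse to mount until one of them is corrected (e.g. with `tunegfs2 -o locktable=...`).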