Cluster service containing lvm resource fails instead of relocating when volume cannot be deactivated in RHEL

Solution Verified

Issue

  • Service with lvm resource fails to stop or recover when the file system on it cannot be unmounted:
Apr 22 17:04:37 node1 rgmanager[26011]: [fs] unmounting /dataFS
Apr 22 17:04:37 node1 rgmanager[26058]: [fs] Sending SIGTERM to processes on /dataFS
Apr 22 17:04:43 node1 rgmanager[26218]: [fs] unmounting /dataFS
Apr 22 17:04:43 node1 rgmanager[26263]: [fs] Sending SIGKILL to processes on /dataFS
Apr 22 17:04:48 node1 rgmanager[26420]: [fs] unmounting /dataFS
Apr 22 17:04:48 node1 rgmanager[26468]: [fs] Sending SIGKILL to processes on /dataFS
Apr 22 17:04:49 node1 rgmanager[13132]: stop on fs "backupdata" returned 1 (generic error)
Apr 22 17:04:51 node1 rgmanager[26525]: [lvm] Logical volume VolGroupBackup/dataFS failed to shutdown
Apr 22 17:04:51 node1 rgmanager[13132]: stop on lvm "BACKUPLVM" returned 1 (generic error)
  • Red Hat cluster does not fail over when the active node loses access to SAN storage
  • The lvm resource fails to shut down, causing the service to fail:
rgmanager[3010]: [lvm] Logical volume VolGroup/LogVol failed to shutdown
  • My node did not reboot itself when it could not shut down the lvm resource, even though self_fence="on" was set
  • I tried to relocate a cluster service from node 2 to node 1 with the command clusvcadm -r <service> -m <node>.
    The relocation failed and the service went down on node 2. To get it working again, I had to disable it a couple of times on node 2 and then re-enable it on node 2.
Jan 28 14:21:54 node2 clurgmgrd[13471]: <notice> Stopping service service:myService 
Jan 28 14:22:02 node2 clurgmgrd: [13471]: <err> Logical volume vg1/lv1 failed to shutdown 
Jan 28 14:22:02 node2 clurgmgrd[13471]: <notice> stop on lvm "vg1" returned 1 (generic error) 
Jan 28 14:22:02 node2 clurgmgrd: [13471]: <err> Logical volume vg2/lv1 failed to shutdown 
Jan 28 14:22:02 node2 clurgmgrd[13471]: <notice> stop on lvm "vg2" returned 1 (generic error) 
Jan 28 14:22:03 node2 clurgmgrd: [13471]: <err> Logical volume vg3/lv1 failed to shutdown 
Jan 28 14:22:03 node2 clurgmgrd[13471]: <notice> stop on lvm "vg3" returned 1 (generic error)
[...]
Jan 28 14:22:14 node2 clurgmgrd[13471]: <crit> #12: RG service:myService failed to stop; intervention required 
Jan 28 14:22:14 node2 clurgmgrd[13471]: <notice> Service service:myService is failed
  • Multiple clustered lvm volumes fail to stop when the service is stopped
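
To check whether a node's logs show the same symptom, the messages quoted above can be searched for mechanically. The following is a minimal sketch, not an official tool: the function name find_stop_failures and the two regular expressions are assumptions derived only from the sample log lines in this article, and other rgmanager or clurgmgrd versions may word these messages differently.

```python
import re

# Patterns derived from the sample messages in this article (an assumption,
# not a documented log format).
LV_FAILED = re.compile(r"Logical volume (\S+) failed to shutdown")
STOP_FAILED = re.compile(r'stop on (\S+) "([^"]+)" returned (\d+)')


def find_stop_failures(lines):
    """Return (failed_lvs, failed_stops) found in an iterable of log lines.

    failed_lvs   -- list of "vg/lv" names reported as failing to shut down
    failed_stops -- list of (resource_type, resource_name, return_code)
                    for stop operations that returned a non-zero code
    """
    failed_lvs = []
    failed_stops = []
    for line in lines:
        match = LV_FAILED.search(line)
        if match:
            failed_lvs.append(match.group(1))
            continue
        match = STOP_FAILED.search(line)
        if match and match.group(3) != "0":
            failed_stops.append(
                (match.group(1), match.group(2), int(match.group(3)))
            )
    return failed_lvs, failed_stops


# Sample lines copied from the log excerpts in this article.
SAMPLE = [
    'Apr 22 17:04:49 node1 rgmanager[13132]: stop on fs "backupdata" returned 1 (generic error)',
    "Apr 22 17:04:51 node1 rgmanager[26525]: [lvm] Logical volume VolGroupBackup/dataFS failed to shutdown",
    'Apr 22 17:04:51 node1 rgmanager[13132]: stop on lvm "BACKUPLVM" returned 1 (generic error)',
    "Jan 28 14:22:02 node2 clurgmgrd: [13471]: <err> Logical volume vg1/lv1 failed to shutdown",
]

if __name__ == "__main__":
    lvs, stops = find_stop_failures(SAMPLE)
    print("Logical volumes that failed to shut down:", lvs)
    print("Resource stop operations that failed:", stops)
```

To scan a real log instead of the sample, pass an open file object to the function, for example find_stop_failures(open("/var/log/messages")); the path is an assumption and depends on the local syslog configuration.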

Environment

  • Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add-On
  • Using HA-LVM
