LVM resource with exclusive=true fails with "timed out after 30000ms" in a RHEL High Availability cluster with pacemaker

Issue

  • An LVM resource's monitor operation is frequently timing out whenever we issue a large copy to a file system
  • My LVM resource keeps failing its monitor
  • A resource group failed over to another node after the LVM resource failed
  • The pacemaker HA-LVM agent's monitor operation failed, with log messages such as:

    Jun  4 14:17:47 node1 lrmd[6514]:  warning: child_timeout_callback: myVG_monitor_10000 process (PID 40301) timed out
    Jun  4 14:17:47 node1 lrmd[6514]:  warning: operation_finished: myVG_monitor_10000:40301 - timed out after 30000ms
    Jun  4 14:17:47 node1 crmd[6517]:    error: process_lrm_event: Operation myVG_monitor_10000: Timed Out (node=node1.example.com call=141, timeout=30000ms)
    
  • The LVM resource is built on multipath devices, but a fluctuation in a few of the sub-paths of a multipath device causes the pacemaker LVM monitor to time out immediately. Half of the multipath device's sub-paths were still active, so the LVM monitor should not have failed or timed out.
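
To check whether a node is hitting the symptom shown in the log excerpt above, the system log can be searched for monitor operations that timed out. The following is a minimal sketch, not an official diagnostic: the function name find_monitor_timeouts is hypothetical, and the default path /var/log/messages is an assumption about where lrmd and crmd messages are written on the node.

```shell
# Hypothetical helper: print log lines that report a timed-out monitor
# operation, such as the lrmd/crmd messages quoted in this article.
# Assumption: cluster daemons log to /var/log/messages unless a path is given.
find_monitor_timeouts() {
    log_file="${1:-/var/log/messages}"
    # Match "<resource>_monitor_<interval>" followed later on the same line
    # by "timed out" or "Timed Out".
    grep -E '_monitor_[0-9]+.*[Tt]imed [Oo]ut' "$log_file"
}
```

For example, running find_monitor_timeouts /var/log/messages on the affected node would list each timeout with its timestamp, which can then be compared against the times of heavy I/O or multipath path events.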

Environment

  • Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
  • pacemaker
  • One or more LVM resources in the CIB with attribute exclusive=true
