LVM resource with exclusive=true fails with "timed out after 30000ms" in a RHEL High Availability cluster with pacemaker

Solution Verified - Updated -

Issue

  • An LVM resource's monitor operation is frequently timing out whenever we issue a large copy to a file system
  • My LVM resource keeps failing its monitor
  • A resource group failed over to another node after the LVM resource failed
  • The pacemaker HA-LVM agent's monitor operation fails with log messages such as:
Jun  4 14:17:47 node1 lrmd[6514]:  warning: child_timeout_callback: myVG_monitor_10000 process (PID 40301) timed out
Jun  4 14:17:47 node1 lrmd[6514]:  warning: operation_finished: myVG_monitor_10000:40301 - timed out after 30000ms
Jun  4 14:17:47 node1 crmd[6517]:    error: process_lrm_event: Operation myVG_monitor_10000: Timed Out (node=node1.example.com call=141, timeout=30000ms)
  • The LVM resource is created on multipath devices, and a fluctuation in a few sub-paths of the multipath device causes the pacemaker LVM monitor to time out immediately. Half of the sub-paths to the multipath device were still active, so the monitor should neither have failed nor timed out.
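
To check how often this symptom occurs, the lrmd messages quoted above can be scanned for timed-out operations. The following is a minimal sketch, not a Red Hat tool: the function name find_monitor_timeouts and the regular expression are illustrative assumptions, and the pattern matches only the "operation_finished ... timed out after Nms" layout shown in the excerpt above. Other pacemaker versions may log a different format.

```python
import re

# Assumption: the log layout matches the lrmd excerpt quoted in this article.
TIMEOUT_RE = re.compile(
    r"operation_finished: (?P<op>\S+):(?P<pid>\d+) - timed out after (?P<ms>\d+)ms"
)


def find_monitor_timeouts(lines):
    """Return (resource, action, interval_ms, timeout_ms) for each timed-out operation."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if not match:
            continue
        # Operation keys in the excerpt look like <resource>_<action>_<interval>,
        # for example myVG_monitor_10000.
        parts = match.group("op").rsplit("_", 2)
        if len(parts) != 3 or not parts[2].isdigit():
            continue  # skip keys that do not follow that layout
        resource, action, interval = parts
        hits.append((resource, action, int(interval), int(match.group("ms"))))
    return hits


# Sample line copied from the log excerpt above.
sample = [
    "Jun  4 14:17:47 node1 lrmd[6514]:  warning: operation_finished: "
    "myVG_monitor_10000:40301 - timed out after 30000ms",
]
print(find_monitor_timeouts(sample))
```

In practice the lines would be read from the system log of the affected node instead of the hard-coded sample. The log file location depends on the local syslog configuration.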

Environment

  • Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
  • pacemaker
  • One or more LVM resources in the CIB with attribute exclusive=true
