LVM resource with exclusive=true fails with "timed out after 30000ms" in a RHEL High Availability cluster with pacemaker
Issue
- An LVM resource's monitor operation frequently times out whenever we issue a large copy to a file system
- My LVM resource keeps failing its monitor
- A resource group failed over to another node after the LVM resource failed
- pacemaker HA-LVM agent monitor operation failed

    Jun 4 14:17:47 node1 lrmd[6514]: warning: child_timeout_callback: myVG_monitor_10000 process (PID 40301) timed out
    Jun 4 14:17:47 node1 lrmd[6514]: warning: operation_finished: myVG_monitor_10000:40301 - timed out after 30000ms
    Jun 4 14:17:47 node1 crmd[6517]: error: process_lrm_event: Operation myVG_monitor_10000: Timed Out (node=node1.example.com call=141, timeout=30000ms)

- The LVM resource is created on multipath devices, but a fluctuation in a few sub paths of the multipath device causes the pacemaker LVM monitor to time out immediately. The multipath device still had half of its sub paths active, so the pacemaker LVM monitor should not have failed or timed out.
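To check how often this symptom occurs on a node, the lrmd timeout messages quoted in the list above can be pulled out of the system log. The following is a minimal sketch rather than a supported tool: the function name `find_monitor_timeouts` and the regular expression are illustrative assumptions based only on the log format shown in this article, and the log file location on your system (commonly `/var/log/messages`) is also an assumption.

```python
import re

# Assumption: the pattern mirrors the lrmd "operation_finished" line quoted
# in this article; other pacemaker versions may word the message differently.
TIMEOUT_RE = re.compile(
    r"lrmd\[\d+\]:\s+warning:\s+operation_finished:\s+"
    r"(?P<op>\S+):(?P<pid>\d+)\s+-\s+timed out after (?P<ms>\d+)ms"
)


def find_monitor_timeouts(lines):
    """Return (operation, pid, timeout_ms) for each timed-out operation."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append(
                (match.group("op"), int(match.group("pid")), int(match.group("ms")))
            )
    return hits


# The two lrmd lines from the Issue section, used here as sample input.
sample = [
    "Jun 4 14:17:47 node1 lrmd[6514]: warning: child_timeout_callback: "
    "myVG_monitor_10000 process (PID 40301) timed out",
    "Jun 4 14:17:47 node1 lrmd[6514]: warning: operation_finished: "
    "myVG_monitor_10000:40301 - timed out after 30000ms",
]

for op, pid, ms in find_monitor_timeouts(sample):
    print(f"{op} (PID {pid}) timed out after {ms} ms")
```

To scan a real log, pass an open file object instead of `sample`, for example `find_monitor_timeouts(open("/var/log/messages"))`, adjusting the path to wherever your system writes cluster messages.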
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add On
- pacemaker
- One or more LVM resources in the CIB with attribute exclusive=true