One or more resources time out on start while the cluster recovers resources following a node event in a RHEL 6 or 7 High Availability cluster
Issue
- A node was fenced, causing another node to take over its resources. During that takeover, while several LVM resources were starting, one of them timed out:
Apr 19 01:09:45 node2 Filesystem(myVG)[106610]: INFO: Running start for /dev/myvg/lv1 on /lv1
Apr 19 01:09:45 node2 LVM(myVG)[106099]: INFO: 4 logical volume(s) in volume group "myVG" now active
Apr 19 01:09:48 node2 lrmd[12512]: warning: child_timeout_callback: myVG_start_0 process (PID 106099) timed out
- When my cluster tries to start several resources at once, they all start more slowly, sometimes causing timeouts. Is there anything I can do to space them out?
- I see resources time out on start when the cluster moves several of them to a node at the same time.
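To confirm which operations actually timed out during a recovery, the log can be scanned for the lrmd warning shown in the excerpt above. The following is a minimal sketch, not an official tool: the names `TIMEOUT_RE` and `find_timeouts` are hypothetical, and the pattern is derived only from the single log line in this article, so other pacemaker versions may format the message differently.

```python
import re

# Pattern derived from the lrmd warning in the log excerpt above (an assumption:
# other pacemaker versions may word or space this message differently).
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<node>\S+)\s+lrmd\[\d+\]:\s+"
    r"warning:\s+child_timeout_callback:\s+"
    r"(?P<resource>\S+)_(?P<op>start|stop|monitor)_\d+\s+"
    r"process \(PID (?P<pid>\d+)\) timed out"
)


def find_timeouts(lines):
    """Return one dict per lrmd operation-timeout warning found in lines."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


# The three log lines quoted in the Issue section.
sample = [
    "Apr 19 01:09:45 node2 Filesystem(myVG)[106610]: INFO: Running start for /dev/myvg/lv1 on /lv1",
    'Apr 19 01:09:45 node2 LVM(myVG)[106099]: INFO: 4 logical volume(s) in volume group "myVG" now active',
    "Apr 19 01:09:48 node2 lrmd[12512]: warning: child_timeout_callback: myVG_start_0 process (PID 106099) timed out",
]

for hit in find_timeouts(sample):
    print(hit["stamp"], hit["node"], hit["resource"], hit["op"], hit["pid"])
```

Run against the excerpt, this reports a single timed-out `start` operation for `myVG` on `node2`. Passing the lines of a full system log to `find_timeouts` would list every such timeout, which helps show whether they cluster around the moment many resources were started together.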
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
- pacemaker
- Multiple resources that could move all at once to a particular node, but are not ordered or grouped relative to each other. In other words, the cluster is allowed to start them in parallel.
