One or more resources time out on start while the cluster is recovering resources following a node event in a RHEL 6 or 7 High Availability cluster
Issue
- A node was fenced, causing another node to take over its resources. During that takeover, while several LVM resources were starting, one of them timed out:
Apr 19 01:09:45 node2 Filesystem(myVG)[106610]: INFO: Running start for /dev/myvg/lv1 on /lv1
Apr 19 01:09:45 node2 LVM(myVG)[106099]: INFO: 4 logical volume(s) in volume group "myVG" now active
Apr 19 01:09:48 node2 lrmd[12512]: warning: child_timeout_callback: myVG_start_0 process (PID 106099) timed out
- When my cluster is trying to start several resources at once, they all start more slowly, sometimes causing timeouts. Is there anything I can do to space them out?
- I see resources time out on start when the cluster moves several of them to a node at the same time.
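To check whether a node has hit this symptom, the system log can be searched for start operations that lrmd reported as timed out. The following is a minimal sketch, not an official diagnostic: the function name `find_start_timeouts` is hypothetical, the default log path `/var/log/messages` is an assumption about where syslog writes on the node, and the pattern is derived only from the message format in the excerpt above.

```shell
# Hypothetical helper: list resource start operations that lrmd reported as timed out.
# The default log path is an assumption; pass the file your syslog actually writes to.
find_start_timeouts() {
    logfile="${1:-/var/log/messages}"
    # Matches lines shaped like the excerpt above, for example:
    #   lrmd[12512]: warning: child_timeout_callback: myVG_start_0 process (PID 106099) timed out
    grep -E 'lrmd\[[0-9]+\]:.*child_timeout_callback: [^ ]+_start_[0-9]+ process .* timed out' "$logfile"
}
```

Running `find_start_timeouts /var/log/messages` on the node that took over the resources prints one line per timed-out start. Several resources beginning their start within the same few seconds, as in the excerpt, suggests they were started in parallel.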
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add-On
- pacemaker
- Multiple resources that could move all at once to a particular node, but are not ordered or grouped relative to each other. In other words: the cluster is allowed to start them in parallel.
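Whether the last condition applies can be checked by reviewing the cluster's constraints and resource layout. The sketch below assumes the cluster is managed with `pcs`; the wrapper name `show_ordering_and_groups` is hypothetical, and the exact output format varies between pcs versions.

```shell
# Hypothetical helper: print constraints and the resource layout so you can see
# whether the affected resources are ordered or grouped relative to each other.
show_ordering_and_groups() {
    # Lists location, ordering, and colocation constraints.
    pcs constraint
    # Lists configured resources; members of a group appear under that group.
    pcs resource
}
```

If the resources that timed out appear in no ordering constraint together and do not share a group, the cluster is free to start them at the same time.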