The monitor and stop operations of an ethmonitor resource timed out in a Pacemaker cluster

Solution Verified

Issue

  • Why did the ethmonitor resource in my cluster time out, causing the node to be rebooted?
  • The ethmonitor resource timed out with no messages indicating "link down" or other network issues:

Dec 18 02:49:54 node-2 lrmd[71867]:  warning: child_timeout_callback: bond0-monitor_monitor_60000 process (PID 8970) timed out
Dec 18 02:49:54 node-2 lrmd[71867]:  warning: operation_finished: bond0-monitor_monitor_60000:8970 - timed out after 60000ms
...
Dec 18 02:49:54 node-2 crmd[71870]:   notice: te_rsc_command: Initiating action 5: stop bond0-monitor_stop_0 on node-2 (local)
Dec 18 02:50:14 node-2 lrmd[71867]:  warning: child_timeout_callback: bond0-monitor_stop_0 process (PID 10133) timed out
Dec 18 02:50:14 node-2 lrmd[71867]:  warning: operation_finished: bond0-monitor_stop_0:10133 - timed out after 20000ms
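
When triaging logs like the excerpt above, it can help to pull out which operation timed out and after how long. The following is a minimal sketch, assuming the lrmd "operation_finished" line format shown in the excerpt, where the operation key has the form <resource>_<operation>_<interval in ms>. The function name parse_timeouts and the dictionary keys are hypothetical and are not part of Pacemaker.

```python
import re

# Matches lrmd "operation_finished" timeout lines in the format seen in the
# excerpt, e.g. "bond0-monitor_monitor_60000:8970 - timed out after 60000ms".
# The key layout <resource>_<operation>_<interval> is taken from those lines.
TIMEOUT_RE = re.compile(
    r"operation_finished: (?P<resource>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)"
    r":(?P<pid>\d+) - timed out after (?P<timeout_ms>\d+)ms"
)


def parse_timeouts(lines):
    """Return one dict per timed-out operation found in the given log lines."""
    results = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue  # not an operation_finished timeout line
        results.append({
            "resource": match.group("resource"),
            "op": match.group("op"),
            "interval_ms": int(match.group("interval")),
            "pid": int(match.group("pid")),
            "timeout_ms": int(match.group("timeout_ms")),
        })
    return results
```

Applied to the excerpt, this would report a monitor operation on bond0-monitor that timed out after 60000 ms, followed by a stop operation that timed out after 20000 ms.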

Environment

  • Red Hat Enterprise Linux 6, 7, or 8 (with the High Availability Add-on)
  • Pacemaker
  • An ocf:heartbeat:ethmonitor resource
