A stonith device is failing to start and/or reporting "Timed Out" errors in a RHEL 6 or 7 High Availability cluster with pacemaker
Issue
- pcs status shows errors for my stonith devices that say they "Timed Out"
fence_node1_start_0 on node2.example.com 'unknown error' (1): call=48, status=Timed Out, last-rc-change='Fri Sep 5 15:50:46 2014', queued=21022ms, exec=0ms
- stonith devices are timing out in my cluster and reporting errors
Jun 01 11:36:07 node1.example.com crmd[2807]: notice: process_lrm_event: Operation fence_node_5356_monitor_0: not running (node=node1.example.com, call=311, rc=7, cib-upda...nfirmed=true)
Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: stonith_action_async_done: Child process 3114 performing action 'monitor' timed out with signal 15
Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: log_operation: Operation 'monitor' [3114] for device 'fence_node2' returned: -62 (Timer expired)
Jun 01 11:36:28 node1.example.com crmd[2807]: error: process_lrm_event: Operation fence_node_node2_start_0: Timed Out (node=node1.example.com, call=312, timeout=20000ms)
- My stonith devices won't start and just stay in a "Stopped" state. I see errors reporting that they "Timed Out".
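
To confirm that a log shows this symptom, one option is to scan it for stonith-ng operations that returned "Timer expired". The following is a minimal sketch, not part of pacemaker: the function name find_fence_timeouts and the regular expression are hypothetical, and the line format is assumed to match the stonith-ng log_operation messages quoted above.

```python
import re

# Assumed format, taken from the log excerpt above:
#   ... stonith-ng[PID]: notice: log_operation: Operation 'ACTION' [PID]
#       for device 'DEVICE' returned: RC (REASON)
LOG_OPERATION_RE = re.compile(
    r"stonith-ng\[\d+\]:\s+\w+:\s+log_operation: Operation '(?P<action>[^']+)' "
    r"\[\d+\] for device '(?P<device>[^']+)' "
    r"returned: (?P<rc>-?\d+) \((?P<reason>[^)]*)\)"
)


def find_fence_timeouts(lines):
    """Return (device, action, rc) for each operation logged as 'Timer expired'."""
    hits = []
    for line in lines:
        match = LOG_OPERATION_RE.search(line)
        if match and match.group("reason") == "Timer expired":
            hits.append(
                (match.group("device"), match.group("action"), int(match.group("rc")))
            )
    return hits


# Two of the lines quoted in the Issue section; only the second is a
# log_operation message, so only it is reported.
sample = [
    "Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: "
    "stonith_action_async_done: Child process 3114 performing action "
    "'monitor' timed out with signal 15",
    "Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: "
    "log_operation: Operation 'monitor' [3114] for device 'fence_node2' "
    "returned: -62 (Timer expired)",
]

print(find_fence_timeouts(sample))
```

The same function can be given the lines of a saved log file. It only identifies which devices and actions timed out; it does not indicate why.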
Environment
- Red Hat Enterprise Linux (RHEL) 6 or 7 with the High Availability Add On
- pacemaker
