A stonith device fails to start and/or reports "Timed Out" errors in a RHEL High Availability cluster with pacemaker
Issue
The pcs status command shows a "Timed Out" error for one or more stonith devices.
fence_node1_start_0 on node2.example.com 'unknown error' (1): call=48, status=Timed Out, last-rc-change='Fri Sep 5 15:50:46 2014', queued=21022ms, exec=0ms
- A stonith device's monitor or start operation times out and reports errors similar to the following:
Jun 01 11:36:07 node1.example.com crmd[2807]: notice: process_lrm_event: Operation fence_node_5356_monitor_0: not running (node=node1.example.com, call=311, rc=7, cib-upda...nfirmed=true)
Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: stonith_action_async_done: Child process 3114 performing action 'monitor' timed out with signal 15
Jun 01 11:36:27 node1.example.com stonith-ng[2803]: notice: log_operation: Operation 'monitor' [3114] for device 'fence_node2' returned: -62 (Timer expired)
Jun 01 11:36:28 node1.example.com crmd[2807]: error: process_lrm_event: Operation fence_node_node2_start_0: Timed Out (node=node1.example.com, call=312, timeout=20000ms)
- The stonith device is in the Stopped state with a "Timed Out" error.
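To check for this symptom across many resources, the failed-action line that pcs status prints can be matched mechanically. The following is a minimal sketch, assuming the line format of the sample shown above; other pcs versions may format failed actions differently. The function name find_timed_out_actions and the regular expression are illustrative, not part of pcs or pacemaker.

```python
import re

# Pattern modeled on the failed-action sample line shown above.
# Assumption: other pcs versions may print a different layout.
FAILED_ACTION = re.compile(
    r"(?P<resource>\S+)_(?P<operation>start|stop|monitor)_(?P<interval>\d+)"
    r" on (?P<node>\S+) '(?P<reason>[^']*)' \((?P<rc>\d+)\):"
    r".*?status=(?P<status>[^,]+),"
)


def find_timed_out_actions(status_text):
    """Return one dict per failed action whose status is 'Timed Out'.

    Hypothetical helper: status_text is the captured text output
    of the pcs status command.
    """
    hits = []
    for line in status_text.splitlines():
        match = FAILED_ACTION.search(line)
        if match and match.group("status").strip() == "Timed Out":
            hits.append(match.groupdict())
    return hits
```

The text passed in could be captured, for example, by running pcs status through subprocess.run with capture_output=True and text=True on a cluster node. Each returned dict names the resource, operation, and node, which narrows down which stonith device to investigate.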
Environment
- Red Hat Enterprise Linux 6, 7, or 8 (with the High Availability Add-On)
- Pacemaker