corosync service randomly stops on one node of a RHEL 6 High Availability cluster
Issue
- The corosync service randomly stops on one node.
- Suddenly, all cluster-related processes from cman and pacemaker report that the cluster and corosync have gone down, but corosync itself logs nothing and simply stops running.
Jul 8 18:00:21 node1 fenced[39630]: cluster is down, exiting
Jul 8 18:00:21 node1 fenced[39630]: daemon cpg_dispatch error 2
Jul 8 18:00:21 node1 dlm_controld[39654]: cluster is down, exiting
Jul 8 18:00:21 node1 attrd[39839]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Jul 8 18:00:21 node1 gfs_controld[39701]: cluster is down, exiting
Jul 8 18:00:21 node1 pacemakerd[39830]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Jul 8 18:00:21 node1 cib[39836]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Jul 8 18:00:21 node1 crmd[39841]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Jul 8 18:00:21 node1 stonith-ng[39837]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Jul 8 18:00:21 node1 gfs_controld[39701]: daemon cpg_dispatch error 2
Jul 8 18:00:21 node1 dlm_controld[39654]: daemon cpg_dispatch error 2
Jul 8 18:00:21 node1 attrd[39839]: crit: attrd_cs_destroy: Lost connection to Corosync service!
Jul 8 18:00:21 node1 stonith-ng[39837]: error: stonith_peer_cs_destroy: Corosync connection terminated
Jul 8 18:00:21 node1 pacemakerd[39830]: error: mcp_cpg_destroy: Connection destroyed
Jul 8 18:00:21 node1 cib[39836]: error: cib_cs_destroy: Corosync connection lost! Exiting.
Jul 8 18:00:21 node1 crmd[39841]: error: crmd_cs_destroy: connection terminated
Jul 8 18:00:21 node1 attrd[39839]: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 2
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 4
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 3
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 6
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 5
Jul 8 18:00:23 node1 kernel: dlm: closing connection to node 1
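To check whether other nodes or earlier dates show the same pattern, the messages above can be searched for programmatically. The following is a minimal sketch, assuming syslog lines have the same shape as the excerpt above. The function name `find_corosync_loss` and the `SIGNATURES` list are illustrative assumptions built from the messages in this article; they are not part of any Red Hat or corosync tooling.

```python
import re
from collections import defaultdict

# Line shape assumed from the excerpt above, for example:
# "Jul 8 18:00:21 node1 fenced[39630]: cluster is down, exiting"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[\w-]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)

# Message fragments copied from the excerpt; extend as needed (assumption:
# these are sufficient to recognize the symptom, not an exhaustive list).
SIGNATURES = (
    "cluster is down, exiting",
    "cpg_dispatch error 2",
    "Connection to the CPG API failed",
    "Lost connection to Corosync service",
    "Corosync connection terminated",
    "Corosync connection lost",
)


def find_corosync_loss(lines):
    """Group daemons that reported a lost corosync connection.

    Returns a dict mapping (host, timestamp) to the set of daemon names
    that logged one of the SIGNATURES at that second.
    """
    events = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like syslog entries
        if any(sig in m.group("msg") for sig in SIGNATURES):
            events[(m.group("host"), m.group("ts"))].add(m.group("proc"))
    return dict(events)
```

Passing the lines of a node's syslog file to `find_corosync_loss` returns one entry per host and second. Several daemons (for example fenced, dlm_controld, and the pacemaker daemons) reporting in the same second, with no preceding corosync message, is consistent with the symptom described in this article.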
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
- corosync