How to deal with active orphan resources


Hi,
In a two-node cluster, the master is on node0. While node1 was shut down, we added resources to the cluster. When node1 is powered back on and rejoins, the resources that were just added on node0 are deleted. How can we avoid this problem, so that cluster resources keep running normally when a node joins?

In corosync.log, the main messages are as follows (xx_pool_H is the name of our cluster resource):

Mar 29 11:57:57 [4875] xx pengine: info: rsc_action_digest_cmp: Parameters to xx_pool_H_monitor_0 on node0 changed: was 8b36ac1527c6ce34f76bb21fb642d3cb vs. now f2317cad3d54cec5d7d7aa7d0bf35cf8 (reload:3.0.14) 0:0;8:3:7:c9bb2393-efb3-4248-a9a3-cc175b05e92a

Mar 29 11:57:57 [4875] xx pengine: notice: pe__clear_failcount: Clearing failure of xx_pool_H on node0 because resource parameters have changed | xx_pool_H_clear_failcount_0

Mar 29 11:57:57 xx pengine[4875]: notice: Clearing failure of xx_pool_H on node0 because resource parameters have changed

Mar 29 11:57:57 xx pengine[4875]: warning: Detected active orphan xx_pool_H running on node0
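To pull messages like these out of a busy log, a filter along the following lines may help. This is only a sketch: the function name find_orphan_messages is hypothetical, and the log path in the usage note is an assumption that may differ on other installations.

```shell
# Hypothetical helper: print scheduler messages about orphans and
# changed resource parameters from the log file given as $1.
find_orphan_messages() {
    # -E enables the alternation between the two message patterns
    grep -E 'Detected active orphan|Parameters to .* changed' "$1"
}
```

It could be called as `find_orphan_messages /var/log/cluster/corosync.log`, assuming the log is stored there.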

To work around this, we tried setting the stop-orphan-resources property to false in the cluster configuration. Now, when node1 rejoins the cluster, the newly added resource XX_pool_H is no longer removed, but it is left running on node0 as an orphan. The cluster status is shown below. This is not what we want, because the resource then cannot be started on the other node during a failover. In other words, with stop-orphan-resources set to false, how can we get the resource state to be "Started node0" instead of "ORPHANED Started node0"?

Online: [ node0 node1 ]

Full list of resources:

XX_pool_H (ocf::heartbeat:zpool): ORPHANED Started node0 (unmanaged)
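For reference, the stop-orphan-resources property can be set along these lines. This is a sketch that assumes the pcs shell is installed and the commands are run as root on a cluster node; crm_attribute is the lower-level Pacemaker tool, and either of the first two commands should be enough on its own.

```shell
# Set the cluster-wide property with pcs (assumes pcs is available)
pcs property set stop-orphan-resources=false

# Alternative: set the same property with Pacemaker's own tool
crm_attribute --type crm_config --name stop-orphan-resources --update false

# Check the value currently stored in the cluster configuration
crm_attribute --type crm_config --name stop-orphan-resources --query
```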

The Linux, Pacemaker, and Corosync versions are as follows:
Linux version 3.10.0-693.el7.x86_64, pacemaker-1.1.23-1.el7_9.1.x86_64, corosync-2.4.5-7.el7.x86_64
Thank you in advance for sharing your experience here.

Responses