public class HAManager
extends Object
Handles high availability (HA): quorum tracking, deployment of HA modules, and failover when nodes fail.
We compute failover and whether there is a quorum synchronously as we receive nodeAdded and nodeRemoved events
from the cluster manager.
It's vital that this is done synchronously as the cluster manager only guarantees that the set of nodes retrieved
from getNodes() is the same for each node in the cluster when processing the exact same nodeAdded/nodeRemoved
event.
When deployment of an HA module is requested, it is deployed immediately if a quorum has been attained; otherwise the
deployment information is added to a list of deployments waiting for a quorum.
Periodically we check the value of attainedQuorum: if true, we deploy any HA deploymentIDs that were waiting for a quorum.
If false, we check whether any HA deploymentIDs are currently deployed, and if so undeploy them and add them to the list
of deploymentIDs waiting for a quorum.
By doing this check periodically we avoid race conditions that could result in modules being deployed after a quorum has
been lost, without having to resort to exclusive locking, which is quite tricky here and prone to deadlock.
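The periodic check described above can be sketched roughly as follows. This is an illustrative model, not the real implementation: the field and method names (`attainedQuorum`, `toDeployOnQuorum`, `deployed`, `checkQuorum`) are assumptions chosen to mirror the description.

```java
import java.util.ArrayList;
import java.util.List;

public class QuorumChecker {
    // Names here are illustrative; the real class keeps similar state internally.
    final List<String> toDeployOnQuorum = new ArrayList<>(); // deploymentIDs waiting for a quorum
    final List<String> deployed = new ArrayList<>();         // deploymentIDs currently deployed
    volatile boolean attainedQuorum;

    // Run periodically on a timer.
    void checkQuorum() {
        if (attainedQuorum) {
            // Quorum present: deploy everything that was waiting for it.
            deployed.addAll(toDeployOnQuorum);
            toDeployOnQuorum.clear();
        } else if (!deployed.isEmpty()) {
            // Quorum lost: undeploy and queue the modules until a quorum returns.
            toDeployOnQuorum.addAll(deployed);
            deployed.clear();
        }
    }
}
```

Because the same timer task performs both directions of the transition, a module can never remain deployed for long after a quorum has been lost, which is the race the description is guarding against.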
We maintain a clustered map where the key is the node id and the value is some stringified JSON which describes
the group of the cluster and an array of the HA modules deployed on that node.
There is an entry in the map for each node of the cluster.
When a node joins the cluster or an HA module is deployed or undeployed that entry is updated.
When a node leaves the cluster cleanly, it removes its own entry before leaving.
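The shape of the clustered map can be illustrated as below. The node id, group name, and JSON field names (`group`, `mods`, `dep_id`, `conf`, `instances`) are hypothetical stand-ins for the real schema, which the description does not spell out.

```java
import java.util.HashMap;
import java.util.Map;

public class ClusterMapExample {
    // Illustrative clustered map: key = node id, value = stringified JSON
    // describing the node's group and the HA modules deployed on it.
    static Map<String, String> buildExample() {
        Map<String, String> clusterMap = new HashMap<>();
        clusterMap.put("node-1",
            "{\"group\":\"my-ha-group\","
          + "\"mods\":[{\"dep_id\":\"deployment-1\",\"conf\":{},\"instances\":2}]}");
        return clusterMap;
    }
}
```

Each node owns exactly one entry (its own), which is why presence or absence of that entry after the node leaves is enough to distinguish a clean close from a sudden death.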
When the cluster manager sends us an event to say a node has left the cluster we check whether its entry in the cluster
map is still there; if it is not, we infer a clean close has happened and no failover will occur.
If the map entry is there it implies the node died suddenly. In that case each node of the cluster must compute
whether it is the failover node for the failed node.
First, each node of the cluster determines whether it is in the same group as the failed node; if not, it cannot be a
candidate for the failover node. Nodes in the cluster only fail over to other nodes in the same group.
If the node is in the same group, it takes the UUID of the failed node, computes its hash-code, and chooses a node by
using the hash-code modulo the number of nodes in the cluster as an index into the list of nodes.
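The selection step can be sketched as below. The method name is an assumption; `Math.floorMod` is used rather than `%` so that a negative `hashCode()` still yields a valid list index.

```java
import java.util.List;

public class FailoverChooser {
    // Hash the failed node's UUID and index into the node list modulo its
    // size. Every node sees the same node list for the same membership event,
    // so every node computes the same answer.
    static String chooseFailoverNode(String failedNodeId, List<String> nodes) {
        int idx = Math.floorMod(failedNodeId.hashCode(), nodes.size());
        return nodes.get(idx);
    }
}
```

Note that the computation is deterministic: given the same failed-node UUID and the same node list, every member of the cluster arrives at the same failover node without any coordination.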
The cluster manager guarantees each node in the cluster sees the same set of nodes for each membership event that is
processed. Therefore it is guaranteed that each node in the cluster will compute the same value. It is critical that
any cluster manager implementation provides this guarantee!
Once the value has been computed, it is compared to the current node, and if it is the same the current node
assumes failover for the failed node.
During failover the failover node deploys all the HA modules from the failed node, as described in the JSON with the
same values of config and instances.
Once failover is complete the failover node removes the cluster map entry for the failed node.
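The failover sequence itself can be sketched as below. The descriptor fields and the deploy step are hypothetical placeholders; the point is only the ordering: redeploy every module with its original config and instance count, then remove the dead node's map entry.

```java
import java.util.List;
import java.util.Map;

public class FailoverProcessor {
    // Hypothetical module descriptor parsed from the failed node's JSON entry
    // (field names and types are assumptions, not the real schema).
    record HaModule(String name, Map<String, Object> config, int instances) {}

    // Deploy each HA module from the failed node with the same config and
    // instance count, then remove the failed node's cluster-map entry.
    static void failover(String failedNodeId, List<HaModule> modules,
                         Map<String, String> clusterMap, List<String> deployedNames) {
        for (HaModule m : modules) {
            deployedNames.add(m.name()); // stand-in for the real deploy call
        }
        clusterMap.remove(failedNodeId);
    }
}
```

Removing the map entry only after all modules are redeployed is what lets other nodes detect, and take over, a failover that was itself interrupted by the failover node dying.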
If the failover node itself fails while it is processing failover for another node, then this is also checked by
other nodes when they detect the failure of the second node.
Author:
Tim Fox