One or more nodes is killed in a RHEL 5 cluster after nodes report "openais[xxxx]: Killing node nodeX because it has rejoined the cluster with existing state"

Solution Verified - Updated 2024-08-07T05:54:23+00:00

Issue

Both nodes in the cluster killed each other simultaneously after a token loss:

openais[3707]: [MAIN ] Killing node node2.example.com because it has rejoined the cluster with existing state 
openais[3707]: [CMAN ] cman killed by node 2 because we rejoined the cluster without a full restart 
gfs_controld[2759]: cluster is down, exiting
dlm_controld[2753]: cluster is down, exiting
kernel: dlm: closing connection to node 2

Why did my two node cluster go down after a network failure?

There was a network split in my cluster. After the network recovered, a node was killed because it "rejoined the cluster without a full restart".

Nov  6 23:30:20 node1 openais[13068]: [MAIN ] Killing node node2 because it has rejoined the cluster with existing state 
Nov  6 23:30:20 node1 openais[13068]: [CMAN ] cman killed by node 2 because we rejoined the cluster without a full restart 

Nov  6 23:30:20 node2 openais[13382]: [MAIN ] Killing node node1 because it has rejoined the cluster with existing state 
Nov  6 23:30:20 node2 openais[13382]: [CMAN ] cman killed by node 1 because we rejoined the cluster without a full restart
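
The mutual kill visible in the excerpt above can also be spotted mechanically when reviewing collected logs. The following is a minimal sketch, assuming syslog lines in the same "timestamp host openais[pid]: ..." layout as the excerpt; the helper names find_kills and find_mutual_kills are hypothetical and not part of any Red Hat tool, and lines without a hostname field (as in the first excerpt) would not match.

```python
import re

# Assumed layout, taken from the excerpt above:
# "<timestamp> <host> openais[<pid>]: [MAIN ] Killing node <victim> because ..."
KILL_RE = re.compile(
    r"(?P<host>\S+) openais\[\d+\]: \[MAIN \] Killing node (?P<victim>\S+) "
    r"because it has rejoined the cluster with existing state"
)

def find_kills(lines):
    """Return a set of (reporting_host, killed_node) pairs found in the lines."""
    kills = set()
    for line in lines:
        m = KILL_RE.search(line)
        if m:
            kills.add((m.group("host"), m.group("victim")))
    return kills

def find_mutual_kills(kills):
    """Return sorted pairs of nodes where each reported killing the other."""
    return sorted({tuple(sorted(p)) for p in kills if (p[1], p[0]) in kills})

# Sample lines copied from the excerpt above.
sample = [
    "Nov  6 23:30:20 node1 openais[13068]: [MAIN ] Killing node node2 "
    "because it has rejoined the cluster with existing state",
    "Nov  6 23:30:20 node2 openais[13382]: [MAIN ] Killing node node1 "
    "because it has rejoined the cluster with existing state",
]
kills = find_kills(sample)
mutual = find_mutual_kills(kills)
print(mutual)
```

In practice the input would be the combined /var/log/messages content from all cluster nodes; a non-empty result indicates that both sides of a split issued the kill to each other.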

The "service cman status" command shows "groupd dead but pid file exists", and the cluster cannot be formed:
```
# service cman status
groupd dead but pid file exists
```
We have an issue with a two-node Red Hat High Availability cluster: both machines shut down by fencing each other.
Red Hat Cluster Suite issue: the cluster is not quorate and is refusing connections. The cluster fences all the nodes. What is causing the nodes to fence each other?

Environment

Red Hat Enterprise Linux (RHEL) 5 with the High Availability Add-On
- A similar issue exists in Red Hat Enterprise Linux 6 clusters

