One or more nodes are killed in a RHEL 6 cluster after nodes report "fenced[xxxx]: telling cman to remove nodeid 2 from cluster"

Issue

  • After a network problem disrupted communication between the nodes and connectivity was subsequently restored, one node killed the other:
Feb  6 17:10:50 node1 corosync[1827]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Feb  6 17:10:50 node1 corosync[1827]:   [QUORUM] Members[2]: 1 2
Feb  6 17:10:50 node1 corosync[1827]:   [QUORUM] Members[2]: 1 2
Feb  6 17:10:50 node1 corosync[1827]:   [CPG   ] chosen downlist: sender r(0) ip(172.22.0.210) ; members(old:1 left:0)
Feb  6 17:10:50 node1 corosync[1827]:   [MAIN  ] Completed service synchronization, ready to provide service.
Feb  6 17:10:50 node1 gfs_controld[1959]: receive_start 2:4 add node with started_count 3
Feb  6 17:10:50 node1 dlm_controld[1910]: receive_start 2:4 add node with started_count 3
Feb  6 17:10:50 node1 dlm_controld[1910]: receive_start 2:4 add node with started_count 3
Feb  6 17:10:50 node1 dlm_controld[1910]: receive_start 2:4 add node with started_count 3
Feb  6 17:10:51 node1 rgmanager[2586]: State change: node2 UP
Feb  6 17:10:53 node1 fenced[1884]: telling cman to remove nodeid 2 from cluster
Feb  6 17:10:30 node2 corosync[1984]: cman killed by node 1 because we were killed by cman_tool or other application
Feb  6 17:10:30 node2 fenced[2041]: telling cman to remove nodeid 1 from cluster
  • Both nodes in a two-node cluster killed each other, each reporting "fenced[xxxx]: telling cman to remove nodeid X from cluster" (an illustrative two-node configuration is sketched after this list)
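For context, a mutual kill like this is typically seen in a two-node configuration where each node holds a single vote and is able to fence its peer on its own. The cluster.conf fragment below is only an illustrative sketch of such a setup, not the configuration of the affected cluster; the cluster name, node names, and the fence_ipmilan device parameters are placeholders:

<?xml version="1.0"?>
<cluster name="example" config_version="1">
  <!-- two_node="1" with expected_votes="1" lets either node retain quorum on its own -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <device name="ipmi-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="1">
          <device name="ipmi-node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- placeholder fence devices; agent, addresses, and credentials are illustrative only -->
    <fencedevice agent="fence_ipmilan" name="ipmi-node1" ipaddr="198.51.100.11" login="admin" passwd="secret"/>
    <fencedevice agent="fence_ipmilan" name="ipmi-node2" ipaddr="198.51.100.12" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>

Because both nodes can retain quorum on their own in a layout like this, each side can decide to fence the other after a communication loss, and if connectivity returns before that fencing completes, "fenced[xxxx]: telling cman to remove nodeid X from cluster" messages can appear on both nodes.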

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • Cluster components shown in the logs above: corosync, cman, fenced, dlm_controld, gfs_controld, rgmanager
