RHEL High Availability cluster nodes will not join when using Cisco network switches

Updated 2015-02-08T02:36:10+00:00

Issue

  • Nodes are online and can reach each other over the network, but cannot form a quorate cluster
  • Each node starts cman and forms a single-node membership, then takes several minutes to detect the other node via multicast (openais):
    
    Node1:
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ] CLM CONFIGURATION CHANGE
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ] New Configuration:
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ] Members Left:
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ] Members Joined:
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
    Oct 19 04:34:41 node01 openais[4000]: [SYNC ] This node is within the primary component and will provide service.
    Oct 19 04:34:41 node01 openais[4000]: [TOTEM] entering OPERATIONAL state.
    Oct 19 04:34:41 node01 openais[4000]: [CMAN ] quorum regained, resuming activity
    Oct 19 04:34:41 node01 openais[4000]: [CLM  ] got nodejoin message 10.16.177.21
    
    Node2:
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ] CLM CONFIGURATION CHANGE 
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ] New Configuration: 
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ]  r(0) ip(10.16.177.22)  
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ] Members Left: 
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ] Members Joined: 
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ]  r(0) ip(10.16.177.22)  
    Oct 19 04:34:45 node02 openais[3365]: [SYNC ] This node is within the primary component and will provide service. 
    Oct 19 04:34:45 node02 openais[3365]: [TOTEM] entering OPERATIONAL state. 
    Oct 19 04:34:45 node02 openais[3365]: [CMAN ] quorum regained, resuming activity 
    Oct 19 04:34:45 node02 openais[3365]: [CLM  ] got nodejoin message 10.16.177.22 
    
    About three minutes later the nodes detect each other, and each kills the other for rejoining with existing state:
    
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ] CLM CONFIGURATION CHANGE
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ] New Configuration:
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.22)
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ] Members Left:
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ] Members Joined:
    Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.22)
    Oct 19 04:37:55 node01 openais[4000]: [SYNC ] This node is within the primary component and will provide service.
    Oct 19 04:37:55 node01 openais[4000]: [TOTEM] entering OPERATIONAL state.
    Oct 19 04:37:55 node01 openais[4000]: [MAIN ] Killing node node02 because it has rejoined the cluster with existing state
    Oct 19 04:37:55 node01 openais[4000]: [CMAN ] cman killed by node 2 because we rejoined the cluster without a full restart
    
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ] CLM CONFIGURATION CHANGE
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ] New Configuration:
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ]  r(0) ip(10.16.177.21)
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ]  r(0) ip(10.16.177.22)
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ] Members Left:
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ] Members Joined:
    Oct 19 04:37:55 node02 openais[3365]: [CLM  ]  r(0) ip(10.16.177.21)
    Oct 19 04:37:55 node02 openais[3365]: [SYNC ] This node is within the primary component and will provide service.
    Oct 19 04:37:55 node02 openais[3365]: [TOTEM] entering OPERATIONAL state.
    Oct 19 04:37:55 node02 openais[3365]: [MAIN ] Killing node node01 because it has rejoined the cluster with existing state
    Oct 19 04:37:55 node02 openais[3365]: [CMAN ] cman killed by node 1 because we rejoined the cluster without a full restart
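
The delay shown in the excerpts above can be measured directly from the logs. The following is a minimal sketch, not a Red Hat tool: it assumes openais messages in the syslog format shown above, and the names `membership_changes`, `join_delay_seconds`, and `SAMPLE_LOG` are hypothetical. Because syslog timestamps carry no year, the year is an arbitrary placeholder.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Oct 19 04:34:41 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
LINE_RE = re.compile(
    r"^\s*(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"openais\[\d+\]:\s+\[(?P<tag>[A-Z]+)\s*\]\s*(?P<msg>.*?)\s*$"
)
IP_RE = re.compile(r"r\(0\) ip\(([\d.]+)\)")


def membership_changes(lines, year=2000):
    """Return (timestamp, host, member_ips) for each 'New Configuration' block.

    The year is an assumption: syslog timestamps do not include one.
    """
    results = []
    current = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("tag") != "CLM":
            continue
        msg = m.group("msg")
        if msg.startswith("New Configuration"):
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            current = (ts, m.group("host"), [])
            results.append(current)
        elif msg.startswith("Members Left"):
            # The member list of the new configuration ends here.
            current = None
        elif current is not None:
            ip = IP_RE.search(msg)
            if ip:
                current[2].append(ip.group(1))
    return results


def join_delay_seconds(lines, host):
    """Seconds between a host's first single-node and first multi-node membership."""
    changes = [c for c in membership_changes(lines) if c[1] == host]
    alone = next((ts for ts, _, ips in changes if len(ips) == 1), None)
    joined = next((ts for ts, _, ips in changes if len(ips) > 1), None)
    if alone is None or joined is None:
        return None
    return (joined - alone).total_seconds()


# Lines copied from the node01 excerpts above.
SAMPLE_LOG = """\
Oct 19 04:34:41 node01 openais[4000]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 19 04:34:41 node01 openais[4000]: [CLM  ] New Configuration:
Oct 19 04:34:41 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
Oct 19 04:34:41 node01 openais[4000]: [CLM  ] Members Left:
Oct 19 04:34:41 node01 openais[4000]: [CLM  ] Members Joined:
Oct 19 04:34:41 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
Oct 19 04:37:55 node01 openais[4000]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 19 04:37:55 node01 openais[4000]: [CLM  ] New Configuration:
Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.21)
Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.22)
Oct 19 04:37:55 node01 openais[4000]: [CLM  ] Members Left:
Oct 19 04:37:55 node01 openais[4000]: [CLM  ] Members Joined:
Oct 19 04:37:55 node01 openais[4000]: [CLM  ]  r(0) ip(10.16.177.22)
"""

if __name__ == "__main__":
    print(join_delay_seconds(SAMPLE_LOG.splitlines(), "node01"))  # 194.0
```

For the node01 excerpts this reports 194 seconds (04:34:41 to 04:37:55), which matches the roughly three-minute gap described above. Running the same check against each node's /var/log/messages, assuming that is where openais logs on the system, shows whether both nodes see the same delay.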
    

Environment

  • Red Hat Enterprise Linux (RHEL) 5, 6, or 7 with the High Availability Add-On
  • Red Hat Cluster Suite (RHCS) 4
  • Multicast communications
  • Network used for cluster communication contains a Cisco switch
