qdiskd takes a long time to initialize and cluster is inquorate while waiting in RHEL 5 or 6


Issue

  • When I start one node in my cluster by itself, I expect it to gain quorum from the quorum device votes. Instead, qdiskd takes a long time to finish initializing, and the node stays inquorate in the meantime (about 90 seconds in the log below):
Nov 11 15:42:09 hostname qdiskd[13063]: <info> Quorum Partition: /dev/disk/by-id/scsi-1IET_00020002 Label: rummyqdisk 
Nov 11 15:42:09 hostname qdiskd[13064]: <info> Quorum Daemon Initializing 
Nov 11 15:42:09 hostname qdiskd[13064]: <debug> I/O Size: 512  Page Size: 4096 
Nov 11 15:42:10 hostname qdiskd[13064]: <info> Heuristic: '/bin/ping -c1 -w1 192.168.143.1' UP 
Nov 11 15:42:10 hostname ccsd[13038]: Cluster is not quorate.  Refusing connection. 
Nov 11 15:42:10 hostname ccsd[13038]: Error while processing connect: Connection refused 
[...]
Nov 11 15:42:49 hostname qdiskd[13064]: <info> Initial score 1/1 
Nov 11 15:42:49 hostname qdiskd[13064]: <info> Initialization complete 
Nov 11 15:42:49 hostname openais[13045]: [CMAN ] quorum device registered 
Nov 11 15:42:49 hostname qdiskd[13064]: <notice> Score sufficient for master operation (1/1; required=1); upgrading 
Nov 11 15:42:49 hostname ccsd[13038]: Cluster is not quorate.  Refusing connection. 
Nov 11 15:42:49 hostname ccsd[13038]: Error while processing connect: Connection refused 
[...]
Nov 11 15:43:09 hostname qdiskd[13064]: <debug> Making bid for master 
Nov 11 15:43:09 hostname ccsd[13038]: Cluster is not quorate.  Refusing connection. 
Nov 11 15:43:09 hostname ccsd[13038]: Error while processing connect: Connection refused 
[...]
Nov 11 15:43:29 hostname qdiskd[13064]: <info> Assuming master role 
Nov 11 15:43:29 hostname ccsd[13038]: Cluster is not quorate.  Refusing connection. 
Nov 11 15:43:29 hostname ccsd[13038]: Error while processing connect: Connection refused 
[...]
Nov 11 15:43:39 hostname openais[13045]: [CMAN ] quorum regained, resuming activity 
  • When I start some, but not all, of the nodes in the cluster, I see many "Cluster is not quorate" messages, and other services such as clvmd, gfs2, and rgmanager then fail to start.
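
To make the delay in the log excerpt easier to see, the following minimal Python sketch computes how long each qdiskd startup phase took. The milestone lines are copied from the excerpt in the first bullet; the helper names (`parse_line`, `phase_durations`) are hypothetical and exist only for this illustration.

```python
from datetime import datetime

# Milestone lines copied from the log excerpt above (trailing spaces trimmed).
LOG = """\
Nov 11 15:42:09 hostname qdiskd[13064]: <info> Quorum Daemon Initializing
Nov 11 15:42:49 hostname qdiskd[13064]: <info> Initialization complete
Nov 11 15:43:09 hostname qdiskd[13064]: <debug> Making bid for master
Nov 11 15:43:29 hostname qdiskd[13064]: <info> Assuming master role
Nov 11 15:43:39 hostname openais[13045]: [CMAN ] quorum regained, resuming activity
"""


def parse_line(line):
    """Split one syslog line into (timestamp, message)."""
    stamp = " ".join(line.split()[:3])
    # Syslog timestamps carry no year; a fixed placeholder year is assumed here.
    when = datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S")
    message = line.split(": ", 1)[1].strip()
    return when, message


def phase_durations(log_text):
    """Return (message, seconds since previous milestone, seconds since first)."""
    events = [parse_line(l) for l in log_text.splitlines() if l.strip()]
    start = events[0][0]
    rows = []
    for (prev, _), (when, message) in zip(events, events[1:]):
        step = int((when - prev).total_seconds())
        total = int((when - start).total_seconds())
        rows.append((message, step, total))
    return rows


for message, step, total in phase_durations(LOG):
    print(f"+{step:>3}s (total {total:>3}s)  {message}")
```

For this excerpt the phases take 40, 20, 20, and 10 seconds, so the node is inquorate for roughly 90 seconds after qdiskd starts.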

Environment

  • Red Hat Enterprise Linux (RHEL) 5 or 6 with the High Availability Add-On
  • Cluster configured with a quorum device (<quorumd> in /etc/cluster/cluster.conf)
  • Cluster nodes starting up asynchronously (not at the same time)
  • cman releases prior to 3.0.12.1-68.el6
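
As a rough way to check the second condition above, this Python sketch looks for a <quorumd> element in a cluster configuration and lists its attributes and heuristics. The sample configuration is an assumption: only the label and the heuristic program are taken from the log in the Issue section, and the cluster name, votes, interval, and tko values are illustrative placeholders. The function name `describe_quorumd` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster.conf fragment. Only the label and heuristic program
# come from the log excerpt; every other value is an illustrative assumption.
SAMPLE_CONF = """\
<cluster name="example" config_version="1">
  <quorumd label="rummyqdisk" votes="1" interval="1" tko="10">
    <heuristic program="/bin/ping -c1 -w1 192.168.143.1" score="1" interval="2"/>
  </quorumd>
</cluster>
"""


def describe_quorumd(conf_xml):
    """Return quorumd attributes and heuristic programs, or None if absent."""
    root = ET.fromstring(conf_xml)
    quorumd = root.find("quorumd")
    if quorumd is None:
        return None
    return {
        "attributes": dict(quorumd.attrib),
        "heuristics": [h.get("program") for h in quorumd.findall("heuristic")],
    }


info = describe_quorumd(SAMPLE_CONF)
if info is None:
    print("No quorum device configured")
else:
    print("quorumd attributes:", info["attributes"])
    print("heuristics:", info["heuristics"])
```

On a cluster node, the same lookup could presumably be run against the real file by replacing `ET.fromstring(conf_xml)` with `ET.parse("/etc/cluster/cluster.conf").getroot()`.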
