9.3.3. Total Cluster Failure
- standalone: not part of an HA cluster.
- joining: newly started backup, not yet joined to the cluster.
- catch-up: backup has connected to the primary and is downloading queues, messages, etc.
- ready: backup is connected and actively replicating from the primary; it is ready to take over.
- recovering: newly promoted to primary, waiting for backups to catch up before serving clients. Only a single primary broker can be recovering at a time.
- active: serving clients. Only a single primary broker can be active at a time.
In a total cluster failure, all brokers are in joining or catch-up mode. rgmanager tries to promote a new primary but cannot find any candidates and so gives up. clustat will show that the qpidd services are running but the qpidd-primary service has stopped, something like this:
Table 9.3.
| Service Name | Owner (Last) | State |
|---|---|---|
| service:mrg33-qpidd-service | 20.0.10.33 | started |
| service:mrg34-qpidd-service | 20.0.10.34 | started |
| service:mrg35-qpidd-service | 20.0.10.35 | started |
| service:qpidd-primary-service | (20.0.10.33) | stopped |

The state of each broker can be checked with qpid-ha status --all.
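
The failure condition described above (every broker in joining or catch-up, so no candidate for promotion) can be expressed as a small check. This is only a sketch: the function name `is_total_failure` is hypothetical, and it takes broker state names as plain arguments rather than parsing the output of qpid-ha, whose exact format is not shown in this section.

```shell
# Sketch: decide whether a set of broker states amounts to a total failure.
# Assumption: the caller passes one state name per broker
# (standalone, joining, catch-up, ready, recovering, active).

is_total_failure() {
    # With no states given, nothing can be concluded.
    [ "$#" -gt 0 ] || return 1
    for state in "$@"; do
        case "$state" in
            joining|catch-up) ;;   # not eligible for promotion
            *) return 1 ;;         # any other state: not a total failure
        esac
    done
    return 0
}

if is_total_failure joining catch-up joining; then
    echo "total failure: no broker can be promoted"
fi
```

A cluster with at least one ready or active broker does not match, so `is_total_failure joining ready catch-up` returns a non-zero status.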
To recover, do one of the following:

- Restart the entire cluster:
  - in luci:<your-cluster>:Nodes, click reboot,
  - or stop and restart the cluster with ccs --stopall; ccs --startall
- Restart only the qpidd services:
  - in luci:<your-cluster>:Service Groups, select all the qpidd (not primary) services and click restart, then select the qpidd-primary service and click restart,
  - or stop the primary and qpidd services with clusvcadm, then restart them (primary last).
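
The clusvcadm option can be sketched as a short script. The service names are taken from the clustat example in this section and will differ on other clusters. The use of `clusvcadm -s` to stop and `clusvcadm -e` to start is an assumption about how to carry out "stop, then restart"; the helper names `run` and `restart_qpid_services` are hypothetical. By default the script only prints the commands it would run.

```shell
# Sketch: stop all Qpid services, then start them again with the primary last.
# Assumptions: service names as shown by clustat in the example above;
# DRY_RUN=1 (the default here) prints commands instead of executing them.

DRY_RUN="${DRY_RUN:-1}"
BROKER_SERVICES="mrg33-qpidd-service mrg34-qpidd-service mrg35-qpidd-service"
PRIMARY_SERVICE="qpidd-primary-service"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_qpid_services() {
    # Stop the primary service and every qpidd broker service.
    run clusvcadm -s "$PRIMARY_SERVICE"
    for svc in $BROKER_SERVICES; do
        run clusvcadm -s "$svc"
    done
    # Start the broker services first, then the primary service last.
    for svc in $BROKER_SERVICES; do
        run clusvcadm -e "$svc"
    done
    run clusvcadm -e "$PRIMARY_SERVICE"
}

restart_qpid_services
```

Review the printed commands first, then rerun with DRY_RUN=0 on a cluster node to execute them.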
A new primary is promoted and the cluster is functional. All non-persistent data from before the failure is lost.
