Chapter 6. High availability

6.1. High availability overview

The term high availability refers to a system that is capable of remaining operational, even when part of that system fails or is taken offline. With Broker on OCP, specifically, HA refers to both maintaining the availability of brokers and the integrity of the messaging data if a broker fails.

In an HA configuration on AMQ Broker on OpenShift Container Platform, you run multiple instances of a broker pod simultaneously. Each individual broker pod writes its message data to a persistent volume (PVs), which logically define the storage volumes in the system. If a broker pod fails or is taken offline, the message data stored in that PV is redistributed to an alternative available broker, which then stores it in its own PV.

Figure 6.1. StatefulSet working normally

ah ocp pod draining

When you take a broker pod offline, the StatefulSet is scaled down and you must manage what happens to the message data in the unattached PV. To migrate the messages held in the PV associated with the now-offline pod, you use the scaledown controller. The process of migrating message data in this fashion is sometimes referred to as pod draining.

Additional resources

  • To enable High Availability on AMQ Broker on OpenShift Container Platform, use the StatefulSets tutorial.

6.2. Message migration

Pod draining is the method used to manage the integrity of messaging data in a Kubernetes StatefulSet. Pod draining is used for message migration, which refers to the removal and redistribution of "orphaned" messages from a persistent volume, due to broker pod failure or intentional scale down.

Figure 6.2. One of the brokers has gone down

ah ocp pod draining 2

6.3. How does pod draining and message migration work?

A Kubernetes StatefulSet scaledown controller image exists for AMQ Broker on OpenShift Container Platform. It runs within the same project namespace as the broker pods.

The scaledown controller registers itself and listens for Kubernetes events that are related to persistent volume claims (PVCs) in the project (openshift) namespace.

The scaledown controller checks for PVCs that have been orphaned by looking at the ordinal on the volume claim. The ordinal on the volume claim is compared to the ordinal on the existing broker pods, which are part of the StatefulSet in the project namespace.

If the ordinal on the volume claim is greater than the ordinal on the existing broker pods, then the pod at that ordinal has been terminated and the data must be migrated to another broker.

When these conditions are met, a drainer pod is started. The drainer pod runs the broker and executes the message migration. Then the drainer pod identifies an alternative broker pod to which the orphaned PVC messages can be migrated.

Figure 6.3. The scaledown controller registers itself, deletes the PVC, and redistributes messages on the PersistentVolume.

ah ocp pod draining 3

After the messages are successfully migrated to an operational broker pod, the drainer pod shuts down and the scaledown controller removes the orphaned PVC.