Chapter 3. Master/Slave

Abstract

Persistent messages require an additional layer of fault tolerance. If a broker fails, the replacement broker must have a copy of all of the undelivered persistent messages. Master/slave groups address this requirement by having a standby broker that shares the active broker's data store.
A master/slave group consists of two or more brokers where one master broker is active and one or more slave brokers are on hot standby, ready to take over whenever the master fails or shuts down. All of the brokers store the message and event data processed by the master broker. So, when one of the slaves takes over as the new master broker, the integrity of the messaging system is guaranteed.
Red Hat JBoss A-MQ supports two master/slave broker configurations:
  • Shared file system—the master and the slaves use a common persistence store that is located on a shared file system
  • Shared JDBC database—the master and the slaves use a common JDBC persistence store

3.1. Shared File System Master/Slave

Overview

A shared file system master/slave group works by sharing a common data store that is located on a shared file system. Brokers automatically configure themselves to operate in master mode or slave mode based on their ability to grab an exclusive lock on the underlying data store.
The disadvantage of this configuration is that the shared file system is a single point of failure. This disadvantage can be mitigated by using a storage area network (SAN) with built-in high availability (HA) functionality. The SAN will handle replication and failover of the data store.
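The role selection described in this overview can be sketched with the standard Java file locking API. This is a minimal illustration under stated assumptions, not the broker's actual locker: the class name ExclusiveLockSketch and the method tryBecomeMaster are hypothetical, and the caller supplies the path of a lock file on the shared file system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the broker's actual locker implementation.
final class ExclusiveLockSketch {

    /**
     * Tries once to take an exclusive lock on the given file.
     * Returns the held lock (the caller would act as master) or
     * null (the caller would remain a slave and try again later).
     */
    static FileLock tryBecomeMaster(Path lockFile) throws IOException {
        // Exclusive locks require a channel that is open for writing.
        FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        try {
            // tryLock() returns null when another process holds the lock.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                channel.close();
            }
            return lock;
        } catch (OverlappingFileLockException e) {
            // Thrown when this same JVM already holds a lock on the file.
            channel.close();
            return null;
        }
    }
}
```

A caller that receives a non-null lock would start its transport connectors. A caller that receives null would wait and retry. Closing the lock's channel releases the lock.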

Supported network file systems

The following network file systems (and only these file systems) are supported by JBoss A-MQ:
  • NFSv4
  • GFS2
  • CIFS/SMB (Windows only)

Recommended NFSv4 client mount options

The goal is to set mount options that provide optimal support for both broker failover and data persistence. For broker failover, you want errors to propagate up to the broker quickly. For data persistence, you want to resend failed requests many times. The trick is to find settings that optimally balance both fault-tolerant messaging features.
The following mount options were used in all NFS locking mechanism tests. The tests were run on Red Hat Enterprise Linux 7.x machines in Red Hat OpenStack Platform. The broker was configured to use the KahaDB store with lockKeepAlivePeriod=2000 (for details, see the section called “File locking requirements”). In these tests, the broker detected lost access to the data store and initiated shutdown within 12 seconds. You may need to adjust these settings depending on your particular setup.
  • soft—Disables continuous retransmission attempts by the client when the NFS server does not respond to a request. Instead, an NFS request fails after retrans transmissions have been sent, causing the NFS client to return an error to the calling client, and thus the broker. This option is key for enabling the timeo and retrans options.
  • timeo=20—The time, in deciseconds, the NFS client waits for a response from the NFS server, before it sends another request. The default is 600 (60 seconds).
  • retrans=2—Specifies the number of times the NFS client attempts to retransmit a failed request to the NFS server. The default is 3. The client waits a timeo timeout period between each retrans attempt.
    Note
    After each retransmission, the timeout period is incremented by timeo, up to the maximum allowed (600 seconds).
  • lookupcache=none—Specifies how the kernel manages its cache of directory entries for the mount point. none forces the client to revalidate all cache entries before they are used. This enables the Master broker to immediately detect any change made to the lock file, and it prevents the lock checking mechanism from returning incorrect validity information.
    The default is all, which means the client assumes that all cached directory entries are valid until their parent directory's cached attributes expire.
  • sync—Any system call that writes data to files on the mount point causes the data to be flushed to the NFS server before the system call returns control to user space. This option provides greater data cache coherence.
  • intr—Allows signals to interrupt file operations on the mount point. System calls return EINTR when an in-progress NFS operation is interrupted by a signal.
  • proto=tcp—Specifies the protocol the NFS client uses to transmit requests to the NFS server.
For more information on NFS mount point options, see http://linux.die.net/man/5/nfs.
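The options listed above are normally supplied together as one comma-separated string, for example as the argument to the mount command's -o flag. The sketch below assembles that string and reports which recommended options are absent from an option string you supply. It is only an illustration: the class NfsMountOptionsSketch and its methods are hypothetical, and the string being checked is whatever the caller passes in, not something read from the system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of JBoss A-MQ or of any NFS tooling.
final class NfsMountOptionsSketch {

    /** The NFSv4 client mount options recommended in this section. */
    static final List<String> RECOMMENDED = Arrays.asList(
            "soft", "timeo=20", "retrans=2", "lookupcache=none",
            "sync", "intr", "proto=tcp");

    /** Joins the recommended options into one comma-separated string. */
    static String optionString() {
        return String.join(",", RECOMMENDED);
    }

    /** Returns the recommended options that are absent from the given string. */
    static List<String> missingFrom(String actualOptions) {
        List<String> actual =
                Arrays.asList(actualOptions.trim().split("\\s*,\\s*"));
        List<String> missing = new ArrayList<>(RECOMMENDED);
        missing.removeAll(actual);
        return missing;
    }

    public static void main(String[] args) {
        // prints soft,timeo=20,retrans=2,lookupcache=none,sync,intr,proto=tcp
        System.out.println(optionString());
    }
}
```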

File locking requirements

The shared file system requires an efficient and reliable file locking mechanism to function correctly. Not all SAN file systems are compatible with the shared file system configuration's needs.
Warning
OCFS2 is incompatible with this master/slave configuration, because mutex file locking from Java is not supported.
Warning
NFSv3 is incompatible with this master/slave configuration. In the event of an abnormal termination of a master broker that is an NFSv3 client, the NFSv3 server does not time out the lock held by the client. This renders the Red Hat JBoss A-MQ data directory inaccessible. Because of this, the slave broker cannot acquire the lock and therefore cannot start up. In this case, the only way to unblock the master/slave in NFSv3 is to reboot all broker instances.
NFSv4, on the other hand, is compatible with this master/slave configuration, as its design includes timeouts for locks. When an NFSv4 client holding a lock terminates abnormally, NFSv4 automatically releases the lock after the specified timeout (see http://tools.ietf.org/html/rfc5661 for details), allowing another NFSv4 client to grab it.
It is possible for a slave to grab the lock from the master without the master's knowledge when NFSv4 crashes. This can happen because the master broker does not automatically check whether it still holds the lock, which gives a slave the chance to grab it once the NFSv4 lock timeout elapses.
You can avoid this scenario by using the persistence adapter's lockKeepAlivePeriod attribute. Setting the lockKeepAlivePeriod attribute instructs the master to check, at intervals of the specified milliseconds, whether it still holds the lock (lock is valid) and that the lock file still exists. If the master discovers that the lock is invalid, it tries to regain the lock. If it fails or the lock file no longer exists, the master enters Slave mode, allowing another slave to try to get the lock and become master.
In attempting to get the lock, the slave checks every lockAcquireSleepInterval (milliseconds) whether another broker holds the lock. If not, the slave locks the file and waits one lockKeepAlivePeriod before entering Master mode. If the lock file does not exist, the slave recreates it and then tries to lock it, following the same procedure it would if the lock file existed.
To enable this lock checking mechanism, add the lockKeepAlivePeriod attribute to the persistence adapter element in the broker configuration. For example:
<kahaDB directory="/sharedFileSystem/sharedBrokerData" lockKeepAlivePeriod="5000" />
This setting instructs the master broker to check at five-second intervals whether the lock is still valid and that the lock file exists. Example 3.1, “Shared File System Broker Configuration” shows how to set the lockAcquireSleepInterval attribute.
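The periodic check that lockKeepAlivePeriod enables can be pictured with the following sketch, which uses only the standard Java library. It is a simplified illustration, not the broker's code: the class LockKeepAliveSketch, its methods, and the onLockLost callback are hypothetical, and the sketch skips the step where the master first tries to regain the lock.

```java
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a keep-alive check; names are assumptions.
final class LockKeepAliveSketch {

    /** True if the lock is still held and valid and the lock file exists. */
    static boolean lockStillValid(FileLock lock, Path lockFile) {
        return lock != null && lock.isValid() && Files.exists(lockFile);
    }

    /**
     * Re-checks the lock every periodMillis (the role that
     * lockKeepAlivePeriod plays) and runs onLockLost once, then stops
     * checking, when the lock is no longer valid.
     */
    static ScheduledExecutorService startKeepAlive(FileLock lock, Path lockFile,
            long periodMillis, Runnable onLockLost) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            if (!lockStillValid(lock, lockFile)) {
                onLockLost.run();   // for example, step down to slave mode
                timer.shutdown();   // no further checks are scheduled
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

In this sketch a shorter period detects a lost lock sooner but touches the shared file system more often. The caller is responsible for shutting down the returned executor when the broker stops.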

Initial state

Figure 3.1, “Shared File System Initial State” shows the initial state of a shared file system master/slave group. When all of the brokers are started, one of them grabs the exclusive lock on the broker data store and becomes the master. All of the other brokers remain slaves and pause while waiting for the exclusive lock to be freed up. Only the master starts its transport connectors, so all of the clients connect to it.

Figure 3.1. Shared File System Initial State

a master and two slaves using a shared file system

State after failure of the master

Figure 3.2, “Shared File System after Master Failure” shows the state of the master/slave group after the original master has shut down or failed. As soon as the master gives up the lock (or after a suitable timeout, if the master crashes), the lock on the data store frees up and another broker grabs the lock and gets promoted to master.

Figure 3.2. Shared File System after Master Failure

master with a single slave
After the clients lose their connection to the original master, they automatically try all of the other brokers listed in the failover URL. This enables them to find and connect to the new master.

Configuring the brokers

In the shared file system master/slave configuration, there is nothing special to distinguish a master broker from the slave brokers. The membership of a particular master/slave group is defined by the fact that all of the brokers in the group use the same persistence layer and store their data in the same shared directory.
Example 3.1, “Shared File System Broker Configuration” shows the broker configuration for a shared file system master/slave group that shares a data store located at /sharedFileSystem/sharedBrokerData and uses the KahaDB persistence store.

Example 3.1. Shared File System Broker Configuration

<broker ... >
  ...
  <persistenceAdapter>
    <kahaDB directory="/sharedFileSystem/sharedBrokerData" lockKeepAlivePeriod="5000">
        <locker>
            <shared-file-locker lockAcquireSleepInterval="10000" />
        </locker>
     </kahaDB>
  </persistenceAdapter>
  ...
</broker>
All of the brokers in the group must share the same persistenceAdapter element.

Configuring the clients

Clients of a shared file system master/slave group must be configured with a failover URL that lists the URLs for all of the brokers in the group. Example 3.2, “Client URL for a Shared File System Master/Slave Group” shows the client failover URL for a group that consists of three brokers: broker1, broker2, and broker3.

Example 3.2. Client URL for a Shared File System Master/Slave Group

failover:(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616)
For more information about using the failover protocol, see Section 2.1.1, “Static Failover”.
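When the list of brokers is kept in configuration rather than hard-coded, the failover URL can be assembled programmatically. The following sketch builds the URL from Example 3.2 out of a list of host names. The class FailoverUrlSketch and the method failoverUrl are hypothetical, and the sketch assumes that every broker listens on the same port.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for composing a failover URL; not a JBoss A-MQ API.
final class FailoverUrlSketch {

    /** Builds failover:(tcp://host1:port,tcp://host2:port,...) from host names. */
    static String failoverUrl(List<String> hosts, int port) {
        return hosts.stream()
                .map(host -> "tcp://" + host + ":" + port)
                .collect(Collectors.joining(",", "failover:(", ")"));
    }

    public static void main(String[] args) {
        // prints failover:(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616)
        System.out.println(
                failoverUrl(List.of("broker1", "broker2", "broker3"), 61616));
    }
}
```

The resulting string would then be passed to the client's connection factory as its broker URL.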

Reintroducing a failed node

You can restart the failed master at any time and it will rejoin the cluster. It will rejoin as a slave broker because one of the other brokers already owns the exclusive lock on the data store, as shown in Figure 3.3, “Shared File System after Master Restart”.

Figure 3.3. Shared File System after Master Restart

a master with two slaves; broker1 is now a slave