6.2. KahaDB Optimization

Overview

The Red Hat JBoss A-MQ message store has evolved through several generations. KahaDB is currently the default (and recommended) message store, while the AMQ message store and the original kaha message store represent earlier generations of message store technology.

KahaDB architecture

The KahaDB architecture—as shown in Figure 6.3, “KahaDB Architecture”—is designed to facilitate high-speed message storage and retrieval. The bulk of the data is stored in rolling journal files (data logs), where all broker events are continuously appended. In particular, pending messages are also stored in the data logs.

Figure 6.3. KahaDB Architecture

To enable rapid retrieval of messages from the data logs, KahaDB maintains a B-tree index, which contains pointers to the locations of all the messages embedded in the data log files. The complete B-tree index is stored on disk, and part or all of it is held in an in-memory cache. The index works most efficiently when the complete index fits into the cache.

Sample configuration

The following example shows how to configure a broker to use the KahaDB message store, by adding a persistenceAdapter element containing a kahaDB child element:
<broker brokerName="broker" persistent="true" useShutdownHook="false">
  ...
  <persistenceAdapter>
    <kahaDB directory="activemq-data" journalMaxFileLength="32mb"/>
  </persistenceAdapter>
</broker>
The directory property specifies the directory where the KahaDB files are stored, and the journalMaxFileLength property specifies the maximum size of a data log file.

Performance optimization

You can optimize the performance of the KahaDB message store by modifying the following properties (set as attributes on the kahaDB element):
  • indexCacheSize (default 10000): specifies the size of the cache in units of pages (where one page is 4 KB by default, so the default cache holds about 40 MB of index data). Generally, the cache should be as large as possible, to avoid swapping pages in and out of memory. Check the size of your metadata store file, db.data, to get some idea of how big the cache needs to be.
  • indexWriteBatchSize (default 1000): defines the threshold for the number of dirty indexes that are allowed to accumulate before KahaDB writes the cache to the store. To maximize the speed of the broker, you could set this property to a large value, so that the store is updated only during checkpoints. This carries the risk of losing a large amount of metadata in the event of a system failure, in which case the broker would restart very slowly.
  • journalMaxFileLength (default 32mb): when broker throughput is very high, a journal file can fill up quickly. Because there is a cost associated with closing a full journal file and opening a new one, you can get a slight performance improvement by increasing the journal file size, so that this cost is incurred less frequently.
  • enableJournalDiskSyncs (default true): normally, the broker performs a disk sync (ensuring that a message has been physically written to disk) before sending the acknowledgement back to a producer. You can obtain a substantial improvement in broker performance by disabling disk syncs (setting this property to false), but a message that has already been acknowledged to the producer can then be lost if the system fails before the data reaches the disk.
    Warning
    If you need to satisfy the JMS durability requirement and be certain that you do not lose any messages, do not disable journal disk syncs.
For more details about these KahaDB configuration properties, see Configuring Broker Persistence.
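As a rough illustration, the four properties described above could be combined on a single kahaDB element as in the following sketch. The values are assumptions chosen only to show the syntax, not recommendations: an indexCacheSize of 100000 pages corresponds to roughly 400 MB of index data at the default 4 KB page size, so it only makes sense if the broker has that much memory to spare, and enableJournalDiskSyncs is left at true so that the durability guarantee is preserved.

```xml
<!-- Fragment: place inside the <broker> element, as in the sample configuration. -->
<!-- All values below are illustrative assumptions, not tuned recommendations.   -->
<!--   indexCacheSize:         100000 pages x 4 KB/page, roughly 400 MB of index -->
<!--   indexWriteBatchSize:    left at its default of 1000                       -->
<!--   journalMaxFileLength:   doubled from the 32mb default                     -->
<!--   enableJournalDiskSyncs: kept at true to avoid losing acknowledged messages -->
<persistenceAdapter>
  <kahaDB directory="activemq-data"
          indexCacheSize="100000"
          indexWriteBatchSize="1000"
          journalMaxFileLength="64mb"
          enableJournalDiskSyncs="true"/>
</persistenceAdapter>
```

Changing one property at a time and measuring the effect under a representative load is probably the safest way to arrive at values that suit a particular deployment.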

Optimizing disk syncs

On some operating systems, you can obtain better performance by configuring KahaDB to call the fdatasync() system call instead of the fsync() system call (the default) when writing to a file. The difference between these system calls is that fdatasync() flushes the file data together with only the metadata needed to read that data back (such as the file size), whereas fsync() flushes the file data and all of the file metadata (for example, the file timestamps).
To enable this optimization, add the following system property setting to the etc/system.properties file in your JBoss A-MQ installation:
org.apache.activemq.kahaDB.files.skipMetadataUpdate=true
Note
This optimization might not be effective on all operating systems, because it ultimately relies on the Java Virtual Machine (JVM) implementation to make the fdatasync() system call. When this option is enabled, the JBoss A-MQ runtime actually calls java.nio.channels.FileChannel#force(false). For some JVMs, this can result in a call to fdatasync() (so that the optimization is effective), but with other JVMs it might be implemented using fsync() (so that the optimization has no effect).
Note
For users of Red Hat Enterprise Linux (RHEL), the implementation of fsync() on RHEL 6 is noticeably slower than on RHEL 4 (this is due to a bug fix in RHEL 6). So, this optimization works particularly well on the RHEL 6 platform, where fdatasync() is significantly faster.