Chapter 2. Using the KahaDB Message Store

Abstract

The KahaDB Message Store is the default message store used by Red Hat JBoss A-MQ. It is a lightweight transactional store that is fast and reliable. It uses a hybrid system that couples a transactional journal for message storage with a reference store for quick retrieval.
Important
Antivirus software can interfere with Red Hat JBoss A-MQ's ability to access the files in the KahaDB message store. Configure your antivirus software to skip the KahaDB data folders during automatic scans.

2.1. Understanding the KahaDB Message Store

Overview

The KahaDB message store is the default persistence store used by Red Hat JBoss A-MQ. It is a file-based persistence adapter that is optimized for maximum performance. The main features of KahaDB are:
  • journal-based storage, so that messages can be written to disk rapidly
  • fast broker restart
  • message references stored in a B-tree index, which can be updated rapidly at run time
  • full support for JMS transactions
  • various strategies to enable recovery after a disorderly shutdown of the broker

Architecture

The KahaDB message store is an embeddable, transactional message store that is fast and reliable. It is an evolution of the AMQ message store used by Apache ActiveMQ 5.0 to 5.3. It uses a transactional journal to store message data and a B-tree index to store message locations for quick retrieval.
Figure 2.1, “Overview of the KahaDB Message Store” shows a high-level view of the KahaDB message store.

Figure 2.1. Overview of the KahaDB Message Store

The KahaDB message store has disk-based data logs that support an indexed in-memory cache.
Messages are stored in file-based data logs. When all of the messages in a data log have been successfully consumed, the data log is marked as deletable. At a predetermined clean-up interval, logs marked as deletable are either removed from the system or moved to an archive.
An index of message locations is cached in memory to facilitate quick retrieval of message data. At configurable checkpoint intervals, the cached references are written to the metadata store on disk.
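The clean-up rule described above can be sketched in a few lines of Java. This is a simplified illustration, not KahaDB's actual implementation: the class DataLogCleanupSketch, its method names, and the per-log counter are hypothetical, and the sketch ignores details such as the data log that is currently being written to.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the clean-up pass; not the real KahaDB classes. */
class DataLogCleanupSketch {
    // data log ID -> number of messages in that log that are not yet consumed
    private final Map<Integer, Integer> unconsumedPerLog = new TreeMap<>();
    // IDs of data logs that were moved to the archive instead of being removed
    final List<Integer> archivedLogs = new ArrayList<>();

    void messageStored(int logId) {
        unconsumedPerLog.merge(logId, 1, Integer::sum);
    }

    void messageConsumed(int logId) {
        // Only decrement logs we know about; the entry stays at 0 until clean-up.
        unconsumedPerLog.computeIfPresent(logId, (id, count) -> count - 1);
    }

    /** A data log is deletable once every message in it has been consumed. */
    boolean isDeletable(int logId) {
        Integer count = unconsumedPerLog.get(logId);
        return count != null && count == 0;
    }

    /** One clean-up pass: reclaim deletable logs, archiving them if requested. */
    List<Integer> cleanup(boolean archive) {
        List<Integer> reclaimed = new ArrayList<>();
        Iterator<Map.Entry<Integer, Integer>> it = unconsumedPerLog.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Integer> entry = it.next();
            if (entry.getValue() == 0) {
                reclaimed.add(entry.getKey());
                if (archive) {
                    archivedLogs.add(entry.getKey());
                }
                it.remove();
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        DataLogCleanupSketch logs = new DataLogCleanupSketch();
        logs.messageStored(1);
        logs.messageStored(2);
        logs.messageConsumed(1);
        System.out.println("reclaimed: " + logs.cleanup(true));
    }
}
```

In the sketch, a log with even one unconsumed message survives every clean-up pass, which is why a single slow consumer can keep an otherwise empty data log on disk.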

Data logs

The data logs are used to store data in the form of journals, where events of all kinds (messages, acknowledgments, subscriptions, subscription cancellations, transaction boundaries, and so on) are stored in a rolling log. Because new events are always appended to the end of the log, a data log file can be updated extremely rapidly.
Implicitly, the data logs contain all of the message data and all of the information about destinations, subscriptions, transactions, and so on. This data, however, is stored in an arbitrary manner. To facilitate rapid access to the content of the logs, the message store constructs metadata to reference the data embedded in the logs.
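To make the append-only idea concrete, the following Java sketch models a rolling journal held in memory. It is an illustration under stated assumptions, not KahaDB's on-disk format: the class JournalSketch, the 4-byte length prefix, and the maxFileLength roll-over threshold are all hypothetical choices made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of a rolling, append-only journal. */
class JournalSketch {
    /** Where a record lives: which data log, and the byte offset inside it. */
    static final class Location {
        final int fileId;
        final int offset;

        Location(int fileId, int offset) {
            this.fileId = fileId;
            this.offset = offset;
        }
    }

    private final int maxFileLength; // assumed roll-over threshold, in bytes
    private final List<ByteArrayOutputStream> dataLogs = new ArrayList<>();

    JournalSketch(int maxFileLength) {
        this.maxFileLength = maxFileLength;
        dataLogs.add(new ByteArrayOutputStream());
    }

    /** Appends an event to the end of the current data log; never rewrites old data. */
    Location append(String event) {
        byte[] payload = event.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream current = dataLogs.get(dataLogs.size() - 1);
        // Roll over to a new data log when the record would not fit.
        if (current.size() > 0 && current.size() + 4 + payload.length > maxFileLength) {
            current = new ByteArrayOutputStream();
            dataLogs.add(current);
        }
        Location where = new Location(dataLogs.size() - 1, current.size());
        byte[] header = ByteBuffer.allocate(4).putInt(payload.length).array();
        current.write(header, 0, header.length);
        current.write(payload, 0, payload.length);
        return where;
    }

    /** Reads the record stored at the given location. */
    String read(Location where) {
        byte[] bytes = dataLogs.get(where.fileId).toByteArray();
        int length = ByteBuffer.wrap(bytes, where.offset, 4).getInt();
        return new String(bytes, where.offset + 4, length, StandardCharsets.UTF_8);
    }

    int fileCount() {
        return dataLogs.size();
    }

    public static void main(String[] args) {
        JournalSketch journal = new JournalSketch(64);
        Location message = journal.append("message:order-1");
        Location ack = journal.append("ack:order-1");
        System.out.println(journal.read(message) + " at offset " + message.offset);
        System.out.println(journal.read(ack) + " at offset " + ack.offset);
    }
}
```

Note that read needs a Location to find anything. Without an index of locations, the only way to find a particular record is to scan the logs from the start, which is the gap the metadata fills.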

Metadata cache

The metadata cache is an in-memory cache consisting mainly of destinations and message references. That is, for each JMS destination, the metadata cache holds a tree of message references, giving the location of every message in the data log files. Each message reference maps a message ID to a particular offset in one of the data log files. The tree of message references is maintained using a B-tree algorithm, which enables rapid searching, insertion, and deletion operations on an ordered list of messages.
The metadata cache is periodically written to the metadata store on the file system. This procedure is known as checkpointing, and the length of time between checkpoints is configurable using the checkpointInterval configuration attribute. For details on how to configure the metadata cache, see Section 2.4, “Optimizing the Metadata Cache”.
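The relationship between the cache and its checkpoints can be sketched as follows. This is a conceptual model only: the class MetadataCacheSketch and its methods are hypothetical, java.util.TreeMap (a red-black tree) stands in for the B-tree index, and an in-memory map stands in for the metadata store on disk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of a per-destination reference index with checkpoints. */
class MetadataCacheSketch {
    /** Location of a message: data log file plus byte offset. */
    static final class Location {
        final int fileId;
        final int offset;

        Location(int fileId, int offset) {
            this.fileId = fileId;
            this.offset = offset;
        }
    }

    // destination name -> ordered index of message ID -> location (in memory)
    private final Map<String, TreeMap<String, Location>> cache = new HashMap<>();
    // stand-in for the metadata store on disk; only updated by checkpoint()
    private final Map<String, TreeMap<String, Location>> store = new HashMap<>();

    void addReference(String destination, String messageId, Location where) {
        cache.computeIfAbsent(destination, d -> new TreeMap<>()).put(messageId, where);
    }

    void removeReference(String destination, String messageId) {
        TreeMap<String, Location> refs = cache.get(destination);
        if (refs != null) {
            refs.remove(messageId);
        }
    }

    /** Looks up a message location in the cache; returns null if unknown. */
    Location lookup(String destination, String messageId) {
        TreeMap<String, Location> refs = cache.get(destination);
        return refs == null ? null : refs.get(messageId);
    }

    /** Copies the current cache contents into the store, as a checkpoint would. */
    void checkpoint() {
        store.clear();
        for (Map.Entry<String, TreeMap<String, Location>> entry : cache.entrySet()) {
            store.put(entry.getKey(), new TreeMap<>(entry.getValue()));
        }
    }

    /** Number of references for a destination as of the last checkpoint. */
    int persistedCount(String destination) {
        TreeMap<String, Location> refs = store.get(destination);
        return refs == null ? 0 : refs.size();
    }

    public static void main(String[] args) {
        MetadataCacheSketch metadata = new MetadataCacheSketch();
        metadata.addReference("orders", "ID:1", new Location(0, 0));
        System.out.println("persisted before checkpoint: " + metadata.persistedCount("orders"));
        metadata.checkpoint();
        System.out.println("persisted after checkpoint: " + metadata.persistedCount("orders"));
    }
}
```

In this model, changes made between checkpoints exist only in memory. A shorter interval between checkpoints narrows that window at the cost of more frequent writes.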

Metadata store

The metadata store contains the complete broker metadata, consisting mainly of a B-tree index giving the message locations in the data logs. The metadata store is written to a file called db.data, which is periodically updated from the metadata cache.
The metadata store duplicates data that is already stored in the data logs (in a raw, unordered form). The presence of the metadata store, however, enables the broker instance to restart rapidly. If the metadata store is damaged or accidentally deleted, the broker can still recover by reading the data logs, but the restart then takes considerably longer.
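The following Java sketch illustrates why the data logs alone are enough to rebuild the index, and why doing so is slow. It is a conceptual model, not the broker's recovery code: the class IndexRecoverySketch, the Event type, and the "ADD" and "ACK" event names are hypothetical, and a list position stands in for a location in the data logs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model of rebuilding the message index by replaying the journal. */
class IndexRecoverySketch {
    /** One journal entry: a message being stored ("ADD") or acknowledged ("ACK"). */
    static final class Event {
        final String type;
        final String messageId;

        Event(String type, String messageId) {
            this.type = type;
            this.messageId = messageId;
        }
    }

    /**
     * Replays every event in order and returns message ID -> journal position
     * for the messages that are still unconsumed at the end of the journal.
     */
    static TreeMap<String, Integer> rebuild(List<Event> journal) {
        TreeMap<String, Integer> index = new TreeMap<>();
        for (int position = 0; position < journal.size(); position++) {
            Event event = journal.get(position);
            if (event.type.equals("ADD")) {
                index.put(event.messageId, position);
            } else if (event.type.equals("ACK")) {
                index.remove(event.messageId);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Event> journal = new ArrayList<>();
        journal.add(new Event("ADD", "m1"));
        journal.add(new Event("ADD", "m2"));
        journal.add(new Event("ACK", "m1"));
        System.out.println("recovered index: " + rebuild(journal));
    }
}
```

The replay must visit every event, so its cost grows with the total size of the data logs, whereas loading an intact metadata store avoids the scan altogether.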