Chapter 1. Introduction to Transactions
This chapter defines some basic transaction concepts and explains how to generate and build a simple transactional JMS example in Apache Camel.
1.1. Basic Transaction Concepts
What is a transaction?
The prototypical transaction is an operation that is conceptually a single step (for example, transferring money from account A to account B) but must be implemented as a series of steps. Such operations are acutely vulnerable to system failures, because a crash can leave some of the steps unfinished and the system in an inconsistent state. For example, if the system crashes after debiting account A but before crediting account B, the net result is that some money disappears into thin air.
In order to make such an operation reliable, it must be implemented as a transaction. On close examination, it turns out that there are four key properties a transaction must have in order to guarantee reliable execution: these are the so-called ACID properties of a transaction.
ACID properties of a transaction
The ACID properties of a transaction are defined as follows:
- Atomic: a transaction is an all-or-nothing procedure; individual updates are assembled and either committed or aborted (rolled back) together when the transaction completes.
- Consistent: a transaction is a unit of work that takes a system from one consistent state to another.
- Isolated: while a transaction is executing, its partial results are hidden from other entities accessing the system.
- Durable: once a transaction has been committed, its results are persistent.
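The atomic and consistent properties can be illustrated with the account transfer described earlier. The following is a minimal sketch, not a real transaction manager: the class AtomicTransferSketch and its methods are hypothetical names, the "resource" is an in-memory map, and rollback is done by restoring a snapshot. It does not attempt to show isolation or durability.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a transactional resource (hypothetical, for illustration only). */
class AtomicTransferSketch {
    private final Map<String, Long> balances = new HashMap<>();

    AtomicTransferSketch(long balanceA, long balanceB) {
        balances.put("A", balanceA);
        balances.put("B", balanceB);
    }

    long balance(String account) {
        return balances.get(account);
    }

    long total() {
        return balances.get("A") + balances.get("B");
    }

    /**
     * Moves amount from A to B as one unit of work. When failAfterDebit is true,
     * a failure is simulated between the two steps and the snapshot is restored.
     * Returns true if the transfer was committed, false if it was rolled back.
     */
    boolean transfer(long amount, boolean failAfterDebit) {
        Map<String, Long> snapshot = new HashMap<>(balances); // state to restore on rollback
        try {
            balances.put("A", balances.get("A") - amount);    // step 1: debit A
            if (failAfterDebit) {
                throw new IllegalStateException("simulated failure after debit");
            }
            balances.put("B", balances.get("B") + amount);    // step 2: credit B
            return true;                                       // commit: keep both updates
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot);                         // rollback: discard the partial update
            return false;
        }
    }
}
```

With starting balances of 100 and 50, a call to transfer(30, true) leaves both balances unchanged, whereas transfer(30, false) yields 70 and 80. In both cases the total remains 150, so no money disappears.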
A transaction client is an API or object that enables you to initiate and end transactions. Typically, a transaction client exposes operations that enable you to begin, commit, or roll back a transaction. In the context of the Spring framework, the PlatformTransactionManager exposes a transaction client API.
Transaction demarcation refers to the initiating and ending of transactions (where transactions can be ended either by being committed or rolled back). Demarcation can be effected either explicitly (for example, by calling a transaction client API) or implicitly (for example, whenever a message is polled from a transactional endpoint).
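Explicit demarcation through a transaction client might look like the following sketch. The TransactionClient interface here is hypothetical and deliberately minimal; it is not the Spring API, although its three operations loosely correspond to the getTransaction, commit, and rollback methods of PlatformTransactionManager. The recording implementation exists only so that the sequence of demarcation calls can be observed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical transaction client API: begin, commit, roll back. */
interface TransactionClient {
    void begin();
    void commit();
    void rollback();
}

/** Records each call so the demarcation sequence can be inspected. */
class RecordingTransactionClient implements TransactionClient {
    final List<String> calls = new ArrayList<>();

    public void begin()    { calls.add("begin"); }
    public void commit()   { calls.add("commit"); }
    public void rollback() { calls.add("rollback"); }
}

class DemarcationSketch {
    /** Explicit demarcation: the caller marks where the transaction starts and ends. */
    static void runInTransaction(TransactionClient tx, Runnable work) {
        tx.begin();
        try {
            work.run();
            tx.commit();       // normal end of the transaction
        } catch (RuntimeException e) {
            tx.rollback();     // abnormal end: undo the unit of work
            throw e;           // let the caller see the original failure
        }
    }
}
```

Work that completes normally produces the call sequence begin, commit; work that throws produces begin, rollback and the exception is propagated. With implicit demarcation, the same begin and end calls are made on your behalf, for example by a transactional endpoint.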
A resource is any component of a computer system that can undergo a persistent or permanent change. In practice, a resource is almost always a database or a service layered over a database (for example, a message service with persistence). Other kinds of resource are conceivable, however. For example, an Automated Teller Machine (ATM) is a kind of resource: once a customer has physically accepted cash from the machine, the transaction cannot be reversed.
A transaction manager is responsible for coordinating transactions across one or more resources. In many cases, a transaction manager is built into a resource. For example, enterprise-level databases generally include a transaction manager that is capable of managing transactions involving that database. But for transactions involving more than one resource, it is normally necessary to employ an external transaction manager implementation.
Managing single or multiple resources
For transactions involving a single resource, the transaction manager built into the resource can generally be used. For transactions involving multiple resources, however, it is necessary to use an external transaction manager or a transaction processing (TP) monitor. In this case, the resources must be integrated with the transaction manager by registering their XA switches. There is also an important difference between the types of algorithm that are used for committing single-resource systems and multiple-resource systems, as follows:
- 1-phase commit—suitable for single-resource systems, this protocol commits a transaction in a single step.
- 2-phase commit—suitable for multiple-resource systems, this protocol commits a transaction in two steps. Including multiple resources in a transaction introduces an extra element of risk: there is the danger that a system failure might occur after some, but not all, of the resources have been committed. This would leave the system in an inconsistent state. The 2-phase commit protocol is designed to eliminate this risk, ensuring that the system can always be restored to a consistent state after it is restarted.
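The two steps of the 2-phase commit protocol can be sketched as follows. This is a simplified illustration under stated assumptions: the Participant interface and the class names are hypothetical (this is not the XA interface), every participant lives in memory, and the durable logging that a real transaction manager needs in order to recover after a restart is omitted.

```java
import java.util.List;

/** Hypothetical participant in a 2-phase commit (not the XA interface). */
interface Participant {
    boolean prepare();   // phase 1: vote; true means "ready to commit"
    void commit();       // phase 2, when every participant voted yes
    void rollback();     // phase 2, when any participant voted no
}

/** In-memory participant whose vote is fixed at construction time. */
class InMemoryParticipant implements Participant {
    private final boolean voteYes;
    String state = "active";

    InMemoryParticipant(boolean voteYes) {
        this.voteYes = voteYes;
    }

    public boolean prepare() {
        state = voteYes ? "prepared" : "refused";
        return voteYes;
    }

    public void commit()   { state = "committed"; }
    public void rollback() { state = "rolled back"; }
}

class TwoPhaseCommitSketch {
    /** Returns true if the transaction was committed on every participant. */
    static boolean complete(List<? extends Participant> participants) {
        // Phase 1: ask each participant whether it can commit.
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
                break;
            }
        }
        // Phase 2: apply the same outcome to every participant.
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }
}
```

Because no participant commits until all of them have voted yes, a single refusal causes every participant to roll back, so the resources never end up in a mixed state.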
Transactions and threading
To understand transaction processing, it is crucial to appreciate the basic relationship between transactions and threads: transactions are thread-specific. That is, when a transaction is started, it is attached to a specific thread (technically, a transaction context object is created and associated with the current thread). From this point on (until the transaction ends), all of the activity in the thread occurs within this transaction scope. Conversely, activity in any other thread does not fall within this transaction's scope (although it might fall within the scope of some other transaction).
From this, we can draw a few simple conclusions:
- An application can process multiple transactions simultaneously: as long as each of the transactions is created in a separate thread.
- Beware of creating subthreads within a transaction: if you are in the middle of a transaction and you create a new pool of threads (for example, by calling the threads() DSL command), the new threads are not in the scope of the original transaction.
- Beware of processing steps that implicitly create new threads: for the same reason given in the preceding point.
- Transaction scopes do not usually extend across route segments: if one route segment ends with to(JoinEndpoint) and another route segment starts with from(JoinEndpoint), these route segments typically do not belong to the same transaction. There are exceptions, however (see the section called “Breaking a route into fragments”).
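The thread-specific nature of a transaction can be demonstrated with a thread-local variable. In the following sketch, the class and method names are hypothetical and the transaction context is reduced to a generated ID; real transaction managers keep richer context objects, but many associate them with the current thread in a comparable way.

```java
import java.util.UUID;

class ThreadScopedTransactionSketch {
    // Holds the transaction context (here just an ID) for the current thread only.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Starts a transaction and attaches its context to the calling thread. */
    static String begin() {
        String id = UUID.randomUUID().toString();
        CURRENT.set(id);
        return id;
    }

    /** Returns the ID of the transaction attached to the calling thread, or null. */
    static String current() {
        return CURRENT.get();
    }

    /** Ends the transaction by detaching the context from the calling thread. */
    static void end() {
        CURRENT.remove();
    }

    /** Returns the transaction ID visible from a newly started thread (null if none). */
    static String visibleInNewThread() {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = current());
        worker.start();
        try {
            worker.join();   // after join, the value written by the worker is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return seen[0];
    }
}
```

After begin() is called, current() returns the new ID in the calling thread, but visibleInNewThread() returns null: the worker thread has no transaction context of its own, which is why steps that hand work to other threads fall outside the original transaction.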
Some advanced transaction manager implementations give you the freedom to detach and attach transaction contexts to and from threads at will. For example, this makes it possible to move a transaction context from one thread to another thread. In some cases it is also possible to attach a single transaction context to multiple threads.
A transaction context is an object that encapsulates the information needed to keep track of a transaction. The format of a transaction context depends entirely on the relevant transaction manager implementation. At a minimum, the transaction context contains a unique transaction identifier.
A distributed transaction refers to a transaction in a distributed system, where the transaction scope spans multiple network nodes. A basic prerequisite for supporting distributed transactions is a network protocol that supports transmission of transaction contexts in a canonical format (see also the section called “Distributed transaction managers”). Distributed transactions lie outside the scope of Apache Camel transactions.
X/Open XA standard
The X/Open XA standard describes a standardized interface for integrating resources with a transaction manager. If you want to manage a transaction that includes more than one resource, it is essential that the participating resources support the XA standard. Resources that support the XA standard expose a special object, the XA switch, which enables transaction managers (or TP monitors) to take control of their transactions. The XA standard supports both the 1-phase commit protocol and the 2-phase commit protocol.