Chapter 5. Handling Transaction Manager Exceptions
5.1. Debugging a Timed-out Transaction
There can be many reasons for a transaction timeout, such as:
- Slow server performance
- A thread is stuck waiting for something or hangs
- A thread needs more time than the configured transaction timeout to complete its processing
To identify a timed-out transaction, look in the logs for the following warning message:
WARN ARJUNA012117 "TransactionReaper::check timeout for TX {0} in state {1}"
where {0} is the Uid of the transaction and {1} is the transaction manager’s view of the state of the timed-out transaction.
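As a minimal sketch, the following Python snippet scans log lines for the ARJUNA012117 message and extracts the transaction Uid and state. The pattern is derived from the message template above; the surrounding log layout (timestamp, logger name, thread name) and the sample lines are assumptions made for illustration and may differ from your server log format.

```python
import re

# Pattern built from the message template quoted above. The text around the
# ARJUNA012117 code (timestamp, logger, thread name) is assumed and may differ.
TIMEOUT_PATTERN = re.compile(
    r"ARJUNA012117\b.*?check timeout for TX (?P<uid>\S+) in state\s+(?P<state>[A-Za-z_]+)"
)


def find_timed_out_transactions(log_lines):
    """Return (uid, state) pairs for every ARJUNA012117 line in log_lines."""
    matches = []
    for line in log_lines:
        found = TIMEOUT_PATTERN.search(line)
        if found:
            matches.append((found.group("uid"), found.group("state")))
    return matches


# Hypothetical log excerpt, written only to exercise the pattern.
sample_log = [
    "10:15:02,118 INFO  [org.jboss.as] (main) server started",
    "10:20:02,450 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) "
    "ARJUNA012117: TransactionReaper::check timeout for TX "
    "0:ffff7f000001:-b66efc2:4f9e6f8f:9 in state RUN",
]

for uid, state in find_timed_out_transactions(sample_log):
    print(f"timed-out transaction {uid} (state {state})")
```

In practice you would pass the lines of your own server log file to find_timed_out_transactions instead of the sample list.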
Transaction Manager provides the following options to debug the transaction timeouts:
- You can configure timeout values for transactions to control transaction lifetimes. The transactions subsystem rolls back a transaction if the timeout value elapses before the transaction terminates by committing or rolling back. You can use the setTransactionTimeout method of the XAResource interface to propagate the current transaction's timeout to the resource manager. If supported, this operation overrides any default timeout associated with the resource manager. Overriding the timeout is useful in situations like the following:
- when long-running transactions have lifetimes that exceed the default
- when using the default timeout might cause the resource manager to roll back before the transaction terminates, causing the transaction to roll back as well.
If you do not specify a timeout value or use a value of 0, the transaction manager uses an implementation-specific default value. In the JBoss EAP transaction manager, the CoordinatorEnvironmentBean.defaultTimeout property represents this implementation-specific default value. The default value is 300 seconds. Setting this property to 0 disables the default transaction timeout.
You can modify the default transaction timeout using the following management CLI command:
/subsystem=transactions:write-attribute(name=default-timeout,value=VALUE)
When running in a managed domain, you must specify which profile to update by preceding the command with /profile=PROFILE_NAME.
- The JBoss EAP transaction manager supports an all-or-nothing approach to calling the setTransactionTimeout method on the XAResource instances. You can set the JTAEnvironmentBean.xaTransactionTimeoutEnabled property to true, which is the default, to call the method on all the instances. Otherwise, you can use the setXATransactionTimeoutEnabled method of the com.arjuna.ats.jta.common.JTAEnvironmentBean class to disable the call and specify timeouts on a per-transaction basis.
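The management CLI command for changing the default timeout differs between a standalone server and a managed domain only by the /profile=PROFILE_NAME prefix. The following Python helper is a small illustrative sketch that assembles the command string for either case; the function name, the rejection of negative values, and the example profile name are assumptions, not part of JBoss EAP.

```python
def default_timeout_command(seconds, profile=None):
    """Build the management CLI command that sets the default transaction timeout.

    `profile` is only needed in a managed domain, where the command is
    prefixed with /profile=PROFILE_NAME.
    """
    # Assumption: a negative timeout is not meaningful, so reject it early.
    if seconds < 0:
        raise ValueError("timeout must be zero or a positive number of seconds")
    prefix = f"/profile={profile}" if profile else ""
    return (
        f"{prefix}/subsystem=transactions"
        f":write-attribute(name=default-timeout,value={seconds})"
    )


# Standalone server.
print(default_timeout_command(600))
# Managed domain; "full-ha" is only an example profile name.
print(default_timeout_command(600, profile="full-ha"))
```

The resulting string can be pasted into a management CLI session or passed to a script that drives the CLI.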
5.2. Migrating Logs to a New JBoss EAP Server
Prerequisites
Ensure that the transactions subsystem is configured identically on the old and the new JBoss EAP servers. An identical configuration, which includes the list of Jakarta Transactions datasources, is required because recovering a log requires contacting the datasources.
5.2.1. Migrating the File-based Log Storage
To migrate the file-based transaction manager logs, copy them to the new JBoss EAP server using the following commands:
- Browse to your EAP_HOME directory. Create an archive of the logs using the following command:
$ tar -cf logs.tar ./standalone/data/tx-object-store
- Extract the archived logs to the new EAP_HOME directory using the following command:
$ tar -xf logs.tar -C NEW_EAP_HOME
5.2.2. Migrating the JDBC Store-based Log Storage
- You can configure the new JBoss EAP server to use the old database and tables as described in Using a JDBC as a Transactions Object Store.
- Alternatively, you can determine the database and the tables used for the transaction logs. Then, you can use an SQL tool to back up the tables and restore them to the new database.
Note: You can find an SQL query tool in the h2 JAR file shipped with JBoss EAP.
5.3. Enabling XTS on JBoss EAP
The XML Transaction Service (XTS) component of the transaction manager supports the coordination of private and public web services in a business transaction. XTS provides WS-AT and WS-BA support for web services hosted on the JBoss EAP server. It is an optional subsystem, which can be enabled using the standalone-xts.xml configuration.
Starting JBoss EAP Server with XTS Enabled
Change to the JBoss EAP server directory:
cd $EAP_HOME
Copy the example XTS configuration file into the /configuration directory:
cp docs/examples/configs/standalone-xts.xml standalone/configuration
Start the JBoss EAP server, specifying the xts configuration:
Linux:
bin/standalone.sh --server-config=standalone-xts.xml
Windows:
bin\standalone.bat --server-config=standalone-xts.xml
5.4. Clearing Up Expired Transactions
The following properties allow you to clear up the expired transactions:
ExpiryEntryMonitor
When the Recovery Manager initializes an expiry scanner thread, it creates the ExpiryEntryMonitor object, which removes dead items from the object store. A number of scanner modules are loaded dynamically, each of which removes the dead items of a particular type.
You can configure the scanner modules in the properties file using the RecoveryEnvironmentBean.expiryScanners system property. The scanner modules are loaded at the time of initialization.
$ EAP_HOME/bin/standalone.sh -DRecoveryEnvironmentBean.expiryScanners=CLASSNAME1,CLASSNAME2
expiryScanInterval
The ExpiryEntryMonitor thread calls all the scanner modules periodically to scan for dead items. You can configure this period, in hours, using the expiryScanInterval system property, as shown in the example below:
$ EAP_HOME/bin/standalone.sh -DRecoveryEnvironmentBean.expiryScanInterval=EXPIRY_SCAN_INTERVAL
All scanner modules inherit the same behavior from the ExpiryScanner interface. This interface provides a scan method, which the scanner thread calls and which all the scanner modules implement, including the following:
ExpiredTransactionStatusManagerScanner
The ExpiredTransactionStatusManagerScanner removes the dead TransactionStatusManagerItems from the object store. These items remain in the object store for a certain period, which is 12 hours by default, before they are deleted. You can configure this time period, in hours, using the transactionStatusManagerExpiryTime system property, as shown in the example below:
$ EAP_HOME/bin/standalone.sh -DRecoveryEnvironmentBean.transactionStatusManagerExpiryTime=TRANSACTION_STATUS_MANAGER_EXPIRY_TIME
AtomicActionExpiryScanner
The AtomicActionExpiryScanner moves transaction logs for AtomicActions that are assumed to have completed. For example, if a failure occurs after a participant has been told to commit but before the transactions subsystem can update the logs, then upon recovery the JBoss EAP transaction manager attempts to replay the commit request. Because the participant has already committed, this replay fails, which prevents the log from being removed. The AtomicActionExpiryScanner is also used when logs cannot be recovered automatically for reasons such as being corrupt or zero length. All logs are moved to a specific location based on the old location appended with /Expired.
Note: AtomicActionExpiryScanner is disabled by default. You can enable it by adding it to the transaction manager properties file. You need not enable it to cope with corrupt logs.
5.5. Recovering Heuristic Outcomes
A heuristic completion occurs when a transaction resource makes a one-sided decision, during the completion stage of a distributed transaction, to commit or roll back the transaction updates. This can leave distributed data in an indeterminate state. Network failures or resource timeouts are possible causes of heuristic completion. Heuristic completion throws one of the following heuristic outcome exceptions:
HEURISTIC_COMMIT
- This exception is thrown when the transaction manager decides to roll back, but all the resources had already committed on their own. In this case, you need not do anything because a consistent termination was reached.
HEURISTIC_ROLLBACK
- This exception implies that the resources have all rolled back because the commit decision from the transaction manager was delayed. As with HEURISTIC_COMMIT, you need not do anything because a consistent termination was reached.
HEURISTIC_HAZARD
- This exception occurs when the disposition of some of the updates is unknown. The updates whose disposition is known have either all been committed or all been rolled back.
HEURISTIC_MIXED
- This exception occurs when some parts of the transaction were rolled back while others were committed.
This procedure shows how to handle a heuristic outcome of a transaction using Jakarta Transactions.
The cause of a heuristic outcome in a transaction is that a resource manager promised it could commit or roll back, and then failed to fulfill the promise. This could be due to a problem with a third-party component, the integration layer between the third-party component and JBoss EAP, or JBoss EAP itself.
By far, the two most common causes of heuristic errors are transient failures in the environment and coding errors in dealing with resource managers.
Usually, if there is a transient failure in your environment, you will know about it before you find out about the heuristic error. This could be due to a network outage, hardware failure, database failure, power outage, or a host of other things.
If you come across a heuristic outcome in a test environment during stress testing, it implies weaknesses in your test environment.
Warning: JBoss EAP automatically recovers transactions that were in a non-heuristic state at the time of failure, but it does not attempt to recover the heuristic transactions.
If you have no obvious failure in your environment, or if the heuristic outcome is easily reproducible, it is probably due to a coding error. You must contact the third-party vendors to find out if a solution is available.
If you suspect the problem is in the transaction manager of JBoss EAP itself, you must raise a support ticket.
- You can attempt to recover the transaction manually using the management CLI. For instructions on manually recovering a transaction, see the Recovering a Transaction Participant section.
The process of resolving the transaction outcome manually is dependent on the exact circumstance of the failure. Perform the following steps, as applicable to your environment:
- Identify which resource managers were involved.
- Examine the state of the transaction manager and the resource managers.
- Manually force log cleanup and data reconciliation in one or more of the involved components.
In a test environment, or if you do not care about the integrity of the data, deleting the transaction logs and restarting JBoss EAP gets rid of the heuristic outcome. By default, the transaction logs are located in the EAP_HOME/standalone/data/tx-object-store/ directory for a standalone server, or the EAP_HOME/domain/servers/SERVER_NAME/data/tx-object-store/ directory in a managed domain. In the case of a managed domain, SERVER_NAME refers to the name of the individual server participating in a server group.
Note: The location of the transaction log also depends on the object store in use and the values set for the object-store-relative-to and object-store-path parameters. For file system logs, such as a standard shadow and Apache ActiveMQ Artemis logs, the default directory location is used, but when using a JDBC object store, the transaction logs are stored in a database.
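The two default locations described above can be expressed as a short Python sketch. It covers only the default file-based layout; the function name and the placeholder values are hypothetical, and custom object-store-relative-to or object-store-path settings and JDBC object stores are deliberately not handled.

```python
from pathlib import PurePosixPath


def default_tx_object_store(eap_home, server_name=None):
    """Return the default file-based transaction log directory.

    With no server_name the standalone layout is used; with a server_name
    the managed-domain layout is used. Custom object-store-relative-to or
    object-store-path values, and JDBC object stores, are not handled.
    """
    home = PurePosixPath(eap_home)
    if server_name is None:
        return home / "standalone" / "data" / "tx-object-store"
    return home / "domain" / "servers" / server_name / "data" / "tx-object-store"


# "/opt/jboss-eap" and "server-one" are placeholder values.
print(default_tx_object_store("/opt/jboss-eap"))
print(default_tx_object_store("/opt/jboss-eap", server_name="server-one"))
```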
5.5.1. Guidelines on Making Decisions for Heuristic Outcomes
Problem Detection
A heuristic decision is one of the most critical errors that can happen in a transaction system. It can lead to parts of the transaction being committed, while other parts are rolled back. Thus, it can violate the atomicity property of the transaction and can compromise data integrity.
A recoverable resource maintains all the information about the heuristic decision in stable storage until it is required by the transaction manager. The actual data saved in stable storage depends on the type of recoverable resource and is not standardized. You can parse through the data and possibly edit the resource to correct any data integrity problems.
Heuristic outcomes are stored in the server log and can be identified using the resource manager and transaction manager.
Manually Committing or Rolling Back a Transaction
Generally, you cannot manually commit or roll back a transaction. From the JBoss EAP transaction management perspective, you can either move a transaction back to the pending list so that automated recovery tries again, or delete the record. For example:
You can use the read-resource operation to check the status of the participants in the transaction:
/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9/participants=2:read-resource
The result will look similar to this:
{
    "outcome" => "success",
    "result" => {
        "eis-product-name" => "ArtemisMQ",
        "eis-product-version" => "2.0",
        "jndi-name" => "java:/JmsXA",
        "status" => "HEURISTIC_HAZARD",
        "type" => "/StateManager/AbstractRecord/XAResourceRecord"
    }
}
The participant status shown here is HEURISTIC_HAZARD, which is eligible for recovery.
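If you capture this output in a script, the status field can be pulled out with a few lines of Python. This is only a rough sketch: it assumes the output contains nothing but quoted strings, so that replacing the "=>" separator with ":" yields valid JSON. Richer management output would need a proper parser, and the function name is hypothetical.

```python
import json

# The result shown above, copied as text.
CLI_OUTPUT = """
{ "outcome" => "success", "result" => { "eis-product-name" => "ArtemisMQ",
  "eis-product-version" => "2.0", "jndi-name" => "java:/JmsXA",
  "status" => "HEURISTIC_HAZARD",
  "type" => "/StateManager/AbstractRecord/XAResourceRecord" } }
"""


def participant_status(cli_output):
    """Extract the participant status from simple read-resource output.

    Assumption: every key and value is a quoted string, so swapping the
    "=>" separator for ":" produces valid JSON. Other value types (for
    example undefined or typed numbers) are not handled.
    """
    response = json.loads(cli_output.replace(" => ", ": "))
    if response.get("outcome") != "success":
        raise RuntimeError("management operation did not succeed")
    return response["result"]["status"]


status = participant_status(CLI_OUTPUT)
print(status, "(heuristic)" if status.startswith("HEURISTIC") else "")
```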
Recovering the HEURISTIC_HAZARD Exception
The following steps show an example of how to recover a hazard type heuristic outcome.
To begin the recovery, you must consult each resource manager and establish the outcomes of the various branches that are identifiable from the transaction manager tooling. However, you should not need to force a resource manager to commit or roll back. Instead, inspect the resource manager to determine the state of the heuristic outcome.
The following are reference links for listing and resolving heuristic outcomes for various resource managers:
Note: These links are for reference purposes only and are subject to change. Please consult the vendor documentation for details.
You must execute the recover operation, as shown in the following example:
/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9/participants=2:recover
Running the recover operation changes the state of the transaction to PREPARE and triggers a recovery attempt by replaying the commit operation. If the recovery attempt is successful, the participant is removed from the transaction log.
You can verify this by running the probe operation on the log-store element again. The participant should no longer be listed. If this is the last participant, the transaction is also deleted.
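The read-resource and recover commands shown above share the same participant address and differ only in the operation name. Note that the colons inside the transaction Uid are escaped with a backslash. The following Python helper is an illustrative sketch, with a hypothetical function name, that builds such a command from an unescaped Uid.

```python
def participant_operation(tx_uid, participant, operation):
    """Build a log-store participant address followed by an operation name.

    Colons inside the transaction Uid are escaped with a backslash, as in
    the commands shown above.
    """
    escaped_uid = tx_uid.replace(":", "\\:")
    return (
        "/subsystem=transactions/log-store=log-store"
        f"/transactions={escaped_uid}/participants={participant}:{operation}"
    )


# The Uid and participant number are the ones used in the examples above.
uid = "0:ffff7f000001:-b66efc2:4f9e6f8f:9"
print(participant_operation(uid, 2, "read-resource"))
print(participant_operation(uid, 2, "recover"))
```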
Recovering the HEURISTIC_ROLLBACK and HEURISTIC_COMMIT Exceptions
If the heuristic outcome is a rollback type, then:
- The resource should not be able to commit the transaction, provided the resource manager is well implemented.
- You must decide whether you should delete the branch from the resource manager, using a forget call, so that the rest of the transaction can commit normally and be cleaned from the transaction store.
- If you do not delete the branch from the resource manager, then the transaction will remain in the transaction store forever.
On the other hand, if the heuristic outcome was a commit type, then you must use the business semantics to deal with the inconsistent outcome.
Further Actions When Manual Reconciliation Fails
You can check the database transaction table, which is the DBA_2PC_PENDING table for Oracle. However, the table to check depends on the specific resource manager. The transaction manager can provide you with the branches to inspect in each resource manager.
Consult the vendor’s documentation for the resource manager for details. If you suspect that the problem is caused by the third-party resource manager, consider raising a support ticket with your supplier.
Revised on 2024-01-17 05:25:55 UTC