XAException.XA_RBCOMMFAIL due to long GC pauses

Solution In Progress - Updated -

Environment

  • Red Hat JBoss Enterprise Application Platform (EAP) 5.2.0
  • Red Hat JBoss SOA Platform 5.2.0
  • Long Garbage Collection (GC) pauses
  • updated remoting.jar
  • added MessagingClusterHealthMBean configuration

Issue

  • Multiple XA_RBCOMMFAIL exceptions appear in server.log
  • Messages are lost (orphan channel ids due to new post office entries)
  • server.log contains warnings similar to the following:
WARN  [com.arjuna.ats.jta.logging.loggerI18N] (<thread>) [com.arjuna.ats.internal.jta.resources.arjunacore.preparefailed] [com.arjuna.ats.internal.jta.resources.arjunacore.preparefailed] XAResourceRecord.prepare - prepare failed with exception XAException.XA_RBCOMMFAIL
WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] (<thread>) [com.arjuna.ats.arjuna.coordinator.BasicAction_36] - BasicAction.End() - prepare phase of action-id <tx> failed.
WARN  [com.arjuna.ats.arjuna.logging.arjLoggerI18N] (<thread>) [com.arjuna.ats.arjuna.coordinator.BasicAction_38] - Action Aborting
WARN  [org.jboss.soa.esb.listeners.message.MessageAwareListener] (<thread>) TransactionalRunner caught transaction exception:
org.jboss.soa.esb.common.TransactionStrategyException: Failed to terminate transaction on current thread
        at org.jboss.soa.esb.common.JBossESBTransactionService$JTATransactionStrategy.terminate(JBossESBTransactionService.java:119)
        at org.jboss.soa.esb.listeners.message.MessageAwareListener$TransactionalRunner.run(MessageAwareListener.java:554)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: javax.transaction.RollbackException: [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] [com.arjuna.ats.internal.jta.transaction.arjunacore.commitwhenaborted] Could not commit transaction.
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1443)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
        at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
        at org.jboss.soa.esb.common.JBossESBTransactionService$JTATransactionStrategy.terminate(JBossESBTransactionService.java:101)

Diagnostic Steps


2013-09-30T11:54:24+0000 297520.642: [Full GC [PSYoungGen: 514402K->0K(516928K)] [ParOldGen: 1398141K->1229174K(1398144K)] 1912543K->1229174K(1915072K) [PSPermGen: 207077K->206811K(393216K)], 67.3812120 secs] [Times: user=10.30 sys=1.69, real=67.37 secs]
2013-09-30T11:54:24+0000 Heap after GC invocations=2741 (full 13):
:
:
2013-09-30T11:54:24+0000 Total time for which application threads were stopped: 68.9951350 seconds
2013-09-30T11:54:24+0000 ** long GC: gc_pause=68.9951350 gc_pause_threshold=5 uptime=297589 uptime_human="3d 10:39:49"
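
The "long GC" line above appears to come from a threshold check on the "Total time for which application threads were stopped" entries. A minimal sketch of such a check is shown below. It assumes the GC log contains those entries (HotSpot prints them when -XX:+PrintGCApplicationStoppedTime is set); the function name long_pauses and the second sample line are made up for illustration, and the default threshold of 5 seconds mirrors gc_pause_threshold=5 in the log.

```python
import re

# Line format printed by HotSpot with -XX:+PrintGCApplicationStoppedTime.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def long_pauses(log_lines, threshold_secs=5.0):
    """Return (line, pause_seconds) for every stop longer than the threshold.

    long_pauses is a hypothetical helper, not part of any JBoss or JVM tool.
    """
    hits = []
    for line in log_lines:
        match = STOPPED_RE.search(line)
        if match is None:
            continue
        pause = float(match.group(1))
        if pause > threshold_secs:
            hits.append((line.strip(), pause))
    return hits


sample = [
    # Taken from the GC log in this article.
    "2013-09-30T11:54:24+0000 Total time for which application threads "
    "were stopped: 68.9951350 seconds",
    # Invented short pause, included only to show a line that is not flagged.
    "2013-09-30T11:54:30+0000 Total time for which application threads "
    "were stopped: 0.0123450 seconds",
]

for _line, pause in long_pauses(sample):
    print(f"long GC pause: {pause:.2f} seconds")
```

Pauses well beyond a few seconds are worth correlating with the timestamps of the XA_RBCOMMFAIL warnings in server.log.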


In the Full GC entry above, the timing summary

[Times: user=10.30 sys=1.69, real=67.37 secs]

shows that the collection used only about 12 seconds of CPU time (user + sys) but took 67.37 seconds of wall-clock time (real). The JVM was therefore starved of CPU cycles for most of the pause: something at the OS level prevented the Java process from getting CPU time.
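
One way to check other GC entries for the same kind of CPU starvation is to compare user + sys with real. The sketch below is an illustration only: the function name cpu_starvation and the min_real_secs cut-off are assumptions, not part of any Red Hat tooling. It relies on the fact that user and sys are summed over all GC threads, so for a parallel collection on a multi-core host they would normally be at least as large as real.

```python
import re

# Matches the timing summary that HotSpot appends to each GC entry.
TIMES_RE = re.compile(
    r"\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\]"
)


def cpu_starvation(line, min_real_secs=1.0):
    """Compare CPU time with wall-clock time for one GC log line.

    Returns None if the line has no [Times: ...] summary. Otherwise returns
    a dict whose 'starved' flag is True when the pause lasted at least
    min_real_secs (an assumed cut-off) and wall-clock time exceeded the
    CPU time consumed by the collector.
    """
    match = TIMES_RE.search(line)
    if match is None:
        return None
    user, sys_time, real = (float(value) for value in match.groups())
    cpu = user + sys_time
    return {
        "cpu_secs": cpu,
        "real_secs": real,
        "off_cpu_secs": max(real - cpu, 0.0),
        "starved": real >= min_real_secs and real > cpu,
    }


entry = "[Times: user=10.30 sys=1.69, real=67.37 secs]"
result = cpu_starvation(entry)
print(
    f"cpu={result['cpu_secs']:.2f}s real={result['real_secs']:.2f}s "
    f"starved={result['starved']}"
)
```

For the entry in this article the check reports starvation, because roughly 55 of the 67 seconds were spent off the CPU.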

Only one occurrence has been observed so far, and its cause is not yet known.
One possible suspect is virtualization: if virtual machines are running on the same OS, heavy memory swapping could keep the JVM from running.
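
To test the swapping hypothesis, swap activity can be sampled around the time of a long pause. The sketch below assumes a Linux host, where /proc/vmstat exposes the cumulative pswpin and pswpout page counters; the helper names and the sample counter values are made up for illustration.

```python
def swap_counters(vmstat_text):
    """Extract the cumulative pswpin/pswpout page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Pages swapped in and out between two snapshots of the counters."""
    return {
        name: after.get(name, 0) - before.get(name, 0)
        for name in ("pswpin", "pswpout")
    }


def read_vmstat(path="/proc/vmstat"):
    """Read the live counters (Linux only); not called in this example."""
    with open(path) as handle:
        return swap_counters(handle.read())


# Invented snapshots standing in for two reads taken some seconds apart.
first = swap_counters("nr_free_pages 1000\npswpin 120\npswpout 450\n")
second = swap_counters("nr_free_pages 900\npswpin 5120\npswpout 9450\n")

delta = swap_activity(first, second)
print(f"pages swapped in: {delta['pswpin']}, out: {delta['pswpout']}")
```

Counters that grow steadily while the JVM is paused would support the swapping explanation; flat counters would point to another cause of the CPU starvation.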

