Process instance execution duplicated using EJB timers on multiple nodes
Issue
We are using RHPAM 7.x on multiple kie-server nodes. All nodes are configured to use the same database, with EJB timers backed by a DB store. The partition name of the EJB timer DB store is the same on all nodes, so that timers from one node can fail over to another node if required.
Intermittently, we have noticed that the same timer is firing on multiple nodes. Depending on the timing, this can lead to the following:
- The task that follows the timer in the process model is executed multiple times. This is an issue for tasks that should only be executed once for a specific process instance.
- The following ERROR is logged on the node that executes the timer after it has already been executed on a different node:
10:17:53,765 ERROR [stderr] (EJB default - 9) java.lang.RuntimeException: org.kie.internal.runtime.manager.SessionNotFoundException: No session found for context 442
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.process.core.timer.impl.GlobalTimerService.getRunner(GlobalTimerService.java:293)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.process.core.timer.impl.GlobalTimerService.getRunner(GlobalTimerService.java:254)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:79)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:48)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.services.ejb.timer.EJBTimerScheduler.executeTimerJobInstance(EJBTimerScheduler.java:128)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.services.ejb.timer.EJBTimerScheduler.transaction(EJBTimerScheduler.java:182)
10:17:53,765 ERROR [stderr] (EJB default - 9) at deployment.kie-server.war//org.jbpm.services.ejb.timer.EJBTimerScheduler.executeTimerJob(EJBTimerScheduler.java:120)
While the ERROR is seen frequently, we only see the duplicate process execution sporadically.
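The two outcomes described above can be modeled with a small, deliberately simplified Java sketch. It is not the jBPM or EAP implementation: `SharedStore`, `Node`, and `simulate` are hypothetical names, and the sketch assumes that the first execution lets the process instance finish, so that its session context is removed. Under that assumption, the result depends only on whether the second node looks up the session before or after the first node has finished.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateTimerSketch {

    /** Hypothetical stand-in for the shared database: tracks active session contexts. */
    static final class SharedStore {
        private final Set<Long> activeSessions = new HashSet<>();
        final List<String> events = new ArrayList<>();

        void startSession(long contextId) { activeSessions.add(contextId); }
        boolean sessionExists(long contextId) { return activeSessions.contains(contextId); }
        void completeSession(long contextId) { activeSessions.remove(contextId); }
    }

    /** Hypothetical node that fires the same timer without coordinating with other nodes. */
    static final class Node {
        private final String name;

        Node(String name) { this.name = name; }

        /** Looks up the session and, if it is still present, executes the task after the timer. */
        boolean begin(SharedStore store, long contextId) {
            if (!store.sessionExists(contextId)) {
                store.events.add(name + ": ERROR no session found for context " + contextId);
                return false;
            }
            store.events.add(name + ": task executed for context " + contextId);
            return true;
        }

        /** Assumption: finishing the work completes the process instance and removes its session. */
        void finish(SharedStore store, long contextId) {
            store.completeSession(contextId);
        }
    }

    /** Runs the same timer on two nodes, either overlapping or strictly one after the other. */
    static List<String> simulate(boolean overlapping) {
        SharedStore store = new SharedStore();
        long contextId = 442L;
        store.startSession(contextId);
        Node a = new Node("node-a");
        Node b = new Node("node-b");

        if (overlapping) {
            // Both nodes look up the session before either one has finished.
            boolean startedA = a.begin(store, contextId);
            boolean startedB = b.begin(store, contextId);
            if (startedA) a.finish(store, contextId);
            if (startedB) b.finish(store, contextId);
        } else {
            // The second node only starts after the first has finished.
            if (a.begin(store, contextId)) a.finish(store, contextId);
            if (b.begin(store, contextId)) b.finish(store, contextId);
        }
        return store.events;
    }

    public static void main(String[] args) {
        System.out.println("Overlapping firing:");
        simulate(true).forEach(System.out::println);
        System.out.println("Sequential firing:");
        simulate(false).forEach(System.out::println);
    }
}
```

In this model the overlapping case produces the duplicate task execution, while the sequential case produces the "no session found" error on the second node. If the real firings rarely overlap, that would be consistent with the ERROR being seen more often than the duplicate execution.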
What needs to be done to ensure a timer instance is only fired on one particular node in a multi-node kie-server environment?
Environment
- Red Hat Process Automation Manager (RHPAM)
  - 7.9.0.GA
  - 7.12.1.GA
- EAP 7.3.2 (or later)
- EJB timers with DB store
- multi-node setup