How to keep the database tables from growing infinitely in RHPAM/BPM Suite?
Environment
- Red Hat JBoss BRMS (BRMS)
- 5.3.x
- Red Hat JBoss BPM Suite (BPMS)
- 6.x
- Red Hat Process Automation Manager (RHPAM)
- 7.x
Issue
- We are developing an application which creates a very high number of process instances. We need to ensure that the tables where jBPM 5 stores the runtime information are not getting filled by information pertaining to ended process instances. How can this be done?
- Are there any database clean-up tasks available in jBPM 5?
- How to enable history logging?
- What is the significance of
SessionInfo
table of jBPM5 schema? - Would like to clean up
SessionInfo
table at regular intervals of time. Is it advisable and would it have any impact? - Is there any API available to clean up the records from tables like
TASK, WORKITEMINFO, PROCESSINSTANCEINFO, PROCESSINSTANCELOG, VARIABLEINSTANCELOG, NODEINSTANCELOG
? If not, then what strategies could be taken to clean up the unwanted/old records manually from these tables?
Resolution
RHPAM 7 / BPM Suite 6
Use LogCleanupCommand. Red Hat BPM Suite - Administration and Configuration Guide - 19.4. Managing log files documentation describes what is needed to prevent unlimited table growth in BPM Suite 6. Note that record deletion by manual SQL is not supported.
BRMS 5.3
-
For the binary persistence of the process instance execution model, the following approaches need to be followed to prevent table growth:
- Do not enable db logger to persist process instance events
- Enable db logger, but clear the log tables periodically to keep them from growing too large. To clean the log tables, the
JPAProcessInstanceDbLog
class offers the methodclear()
, otherwise the data could be purged from the database directly.
-
For the human task related tables, the following approaches could be followed to prevent table growth:
- Create a listener similar to TaskCleanUpProcessEventListener1 - due to signed jars which are shipped with the platform, use a different package name. This listener will be included in an upcoming BRMS version.
- Use the
TasksAdmin
2 API to periodically remove ended tasks, by calling it'sremoveTasks
method. -
Delete the records directly in the database - the
TASK
table has a status field which could be used to query for tasks which could be removed. It is needed to create a script that takes care of properly handling the entries in the referenced tables.-
SessionInfo
table contains all the persistent sessions. As soon as a persistent-knowledge session is created using theJPAKnowledgeService
helper class, a new row is inserted inside thisSessionInfo
table. Please note that this table not only contains the sessions having active process instances but all the sessions which have been created over time (unless records have not been explicitly deleted). -
Records from the
SessionInfo
table can be manually removed from the database itself since there is no API for doing that. But keep in mind couple of factors before doing that.- As per the business need people may want to keep sessions with a
STARTDATE
column value which is less than[(current date) - x days]
. Unless they havetimers, rules data, globals
etc. More details could be found in Kris's comment inJBPM-3582
JIRA3. - The other factor to be kept under consideration could be, when are the
ksessions
being reloaded from the database. If it is also done from within the code, then obviously, whenksessions
are being reloaded needs to be taken into account.
- As per the business need people may want to keep sessions with a
-
-
There are no direct API available for removing old records from most of the following tables. However, here are a few of the strategies which can be considered (still it might differ as per business need) while removing records from some of the jBPM 5/Human Task related tables:
1. TASK table
When a process instance is aborted the tasks are put into the
Exited
status but they are not cleaned up in the DB. There have also been cases where tasks withstatus = "Completed"
are orphaned (= process instance of TASK.PROCESSINSTANCEID finished and outdated like LOG tables below). For these scenarios jBPM 5.2 contains aorg.jbpm.task.admin.TaskAdmin
class [2][3] that has apublic int removeTasks(List<TaskSummary> tasks);
method which can delete tasks. This is the easiest way to go about deleting tasks.2. WORKITEMINFO table
There aren't any direct API available for removing these records but users can clean up the records using this approach manually:
When CREATIONDATE column value < (current date - x days) and the PROCESSINSTANCEID value does not exist in the PROCESSINSTANCEINFO table
Please note the relationship to the process instance is a determining factor here.
3. PROCESSINSTANCEINFO table
Again for this as well there aren't any direct API but manually deleting records can be done. However, if there process instances
in limbo
that havestate 1 (active)
, then before manually removing those records users need to determine whether or not records that should be deleted is the age (perhaps looking at both thelastmodificationdate
and thelastreaddate
).4. PROCESSINSTANCELOG table
No direct API available but manually it can done by the following strategy
When END_DATE column value is < (current date - x days)
5. VARIABLEINSTANCELOG table
No direct API available but manually it can done by the following strategy
When LOG_DATE column value is < (current date - x days) and the PROCESSINSTANCEID value does not exist in the PROCESSINSTANCELOG table
6. NODEINSTANCELOG table
No direct API available but manually it can done by the following strategy.
When LOG_DATE column value is < (current date - x days) and the PROCESSINSTANCEID value does not exist in the PROCESSINSTANCELOG table
Root Cause
-
By default, jBPM 5 only keeps records about active process instances in its tables. For instance, after starting a process instance, a new record will be inserted into the
ProcessInstanceInfo
table. After ending the process instance, the record in theProcessInstanceInfo
table will be deleted again. -
Only if you have instructed jBPM 5 to create a database history log, by registering
JPAWorkingMemoryDbLogger
, then additional records are created during process execution in theProcessInstanceLog
,NodeInstanceLog
andVariableInstanceLog
tables:StatefulKnowledgeSession ksession = ...; JPAWorkingMemoryDbLogger logger = new JPAWorkingMemoryDbLogger(ksession); // invoke methods one your session here logger.dispose();
Note: These log records are not cleaned automatically when the process instance ends.
-
Refers to the documentation links provided to get information about the DB schema, both for the binary persistence of the process instance execution model1, as well as the history log2.
-
While it is true that jBPM 5 does not keep information about ended process instances, except for the log tables, this is not the case for the human task component. Therefore, if there are processes which use human tasks, the human task related tables will grow in size. For an overview of these tables, please refer to the newly added chapter in the jBPM 5.3 User Guide3.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Comments