Using Batch Processing (JBeret) with a cluster of nodes sharing the same Job Repository in JBoss EAP 8

The Batch Processing specification (JSR-352) does not define cluster awareness, but clustered EAP nodes can be configured to share the same job repository.

Configure the batch-jberet subsystem to use a JDBC job repository

This example uses MyDatasource, which stands for a datasource already defined in the datasources subsystem.
The following commands configure the batch-jberet subsystem to use that datasource for the job repository instead of the default in-memory repository.

/subsystem=batch-jberet/jdbc-job-repository=jdbc:add(data-source=MyDatasource)
/subsystem=batch-jberet:write-attribute(name=default-job-repository,value=jdbc)

The commands result in this configuration:

        <subsystem xmlns="urn:jboss:domain:batch-jberet:3.0">
            <default-job-repository name="jdbc"/>
            ...
            <job-repository name="jdbc">
                <jdbc data-source="MyDatasource"/>
            </job-repository>
            ...
        </subsystem>

Restarting a Job

To avoid restarting a job execution that is currently running on another node, set the jberet.restart.mode job parameter to strict when restarting:

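// jobOperator is the application's JobOperator, for example BatchRuntime.getJobOperator()
// oldId is the ID of the job execution to restart
final Properties properties = new Properties();
properties.setProperty("jberet.restart.mode", "strict");
final long restartId = jobOperator.restart(oldId, properties);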

JBeret supports restarting a previous job execution that was terminated abruptly, for example by a JVM or system crash, and was left stuck in a status such as STARTED. When asked to restart a job execution with STARTED status, JBeret still checks whether that execution is running, so restarting on the same node does not produce duplicate executions. When such a job execution is restarted on another node, however, the conditions resemble a restart after a crash, and JBeret mistakenly honors the user's instruction to restart.
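One way to make sure every restart in a clustered deployment carries the strict setting is to route restarts through a small helper. The following is a minimal sketch, not part of JBeret or EAP: the class StrictRestart and its methods are hypothetical names, and the restartCall parameter is an assumed stand-in for JobOperator::restart so that the example compiles without the Batch API on the classpath.

```java
import java.util.Properties;
import java.util.function.ToLongBiFunction;

/** Hypothetical helper: always restart with jberet.restart.mode=strict. */
final class StrictRestart {

    static final String RESTART_MODE_KEY = "jberet.restart.mode";

    private StrictRestart() {
    }

    /** Copies the caller's restart parameters (may be null) and forces strict mode. */
    static Properties strictProperties(Properties jobParameters) {
        Properties copy = new Properties();
        if (jobParameters != null) {
            copy.putAll(jobParameters);
        }
        // Overrides any restart mode the caller may have supplied.
        copy.setProperty(RESTART_MODE_KEY, "strict");
        return copy;
    }

    /**
     * Delegates to restartCall with the strict properties.
     * restartCall is an assumption standing in for JobOperator::restart.
     */
    static long restart(ToLongBiFunction<Long, Properties> restartCall,
                        long executionId,
                        Properties jobParameters) {
        return restartCall.applyAsLong(executionId, strictProperties(jobParameters));
    }
}
```

Inside an application this could be called as StrictRestart.restart(jobOperator::restart, oldId, null), which passes the same properties as the snippet above.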

To restart from the CLI:

/deployment=batch-example.war/subsystem=batch-jberet:restart-job(execution-id=57)

Restart specifying jberet.restart.mode=strict:

/deployment=batch-example.war/subsystem=batch-jberet:restart-job(execution-id=57, properties={jberet.restart.mode=strict})

Start a job

/deployment=batch-example.war/subsystem=batch-jberet:start-job(job-xml-name=numbers)

Stop a job

/deployment=batch-example.war/subsystem=batch-jberet:stop-job(execution-id=1)

Check the status of a job

/deployment=batch-example.war/subsystem=batch-jberet/job=numbers/execution=1:read-resource(include-runtime, recursive)
{
    "outcome" => "success",
    "result" => {
        "batch-status" => "STOPPED",
        "create-time" => "2020-10-29T19:33:13.843-0400",
        "end-time" => "2020-10-29T19:33:30.258-0400",
        "exit-status" => "STOPPED",
        "instance-id" => 1L,
        "last-updated-time" => "2020-10-29T19:33:30.258-0400",
        "start-time" => "2020-10-29T19:33:13.853-0400"
    }
}
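Because all nodes share one job repository, the batch-status value reported for an execution can hint at whether that execution might still be active on some node before a restart is attempted. The sketch below classifies the status strings defined by the Batch API's BatchStatus enum. The class and method names are hypothetical, and the status is assumed to arrive as a plain string, for example copied from the CLI output or obtained from getBatchStatus().name().

```java
import java.util.Set;

/** Hypothetical helper for interpreting the batch-status value of a job execution. */
final class BatchStatusCheck {

    // BatchStatus values that mean the execution has ended.
    private static final Set<String> FINISHED =
            Set.of("COMPLETED", "STOPPED", "FAILED", "ABANDONED");

    // BatchStatus values that mean the execution is, or appears to be, in progress.
    private static final Set<String> IN_PROGRESS =
            Set.of("STARTING", "STARTED", "STOPPING");

    private BatchStatusCheck() {
    }

    /** True when the execution has ended. */
    static boolean isFinished(String batchStatus) {
        return FINISHED.contains(batchStatus);
    }

    /**
     * True when the repository still shows the execution as in progress.
     * It may be running on another node or be left over from a crash,
     * which is the case where restarting with strict mode matters.
     */
    static boolean mayBeRunningElsewhere(String batchStatus) {
        return IN_PROGRESS.contains(batchStatus);
    }
}
```

For the output shown above, isFinished("STOPPED") is true and mayBeRunningElsewhere("STOPPED") is false, so that execution is not in progress on any node.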

Notes

  • The CLI commands use batch-example.war as an example deployment name. When running the commands, replace it with the name of the deployment that contains your batch application.
