Automatically Taking Sites Offline with Asynchronous Cross-Site Replication in a JDG Cluster

  • Red Hat JBoss Data Grid (JDG)
    • 7.3.2.+


  • How can I tell whether a site was taken offline automatically when the after-failures parameter is configured?
  • How can I tell whether a site was taken offline automatically when the min-wait parameter is configured?


Data Grid applies the take-offline configuration of a backup site when cross-site replication is in use.
The following example configuration takes the backup site offline automatically after 20 seconds:

  <backup site="site01" strategy="ASYNC">
    <take-offline after-failures="-1" min-wait="20000"/>
  </backup>

  • after-failures - the number of failed backup operations after which the site is taken offline. Defaults to 0 (never). With a negative value, the site is taken offline once min-wait (minTimeToWait) has elapsed.

  • min-wait - the number of milliseconds during which the site is not marked offline, even if backup operations to it have already failed after-failures times. If less than or equal to 0, only after-failures is considered.
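
As an illustration of how the two parameters combine, the following sketch sets both. The values 5 and 20000 are arbitrary and chosen only for this example. Based on the parameter descriptions above, site01 should be taken offline only once at least 5 backup operations have failed and at least 20 seconds have passed since the first failure.

```xml
<!-- Illustrative values only: tune after-failures and min-wait for your environment -->
<backup site="site01" strategy="ASYNC">
  <!-- Offline once 5 backup operations have failed AND 20000 ms have elapsed since the first failure -->
  <take-offline after-failures="5" min-wait="20000"/>
</backup>
```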

NOTE: Automatically taking sites offline with strategy="ASYNC" is available only in JDG 7.3.2 and later; earlier releases take sites offline automatically only with strategy="SYNC".

Diagnostic Steps

Enable TRACE level log messages for the class org.infinispan.xsite.OfflineStatus.
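
One way to enable this logging, assuming the JDG server uses the WildFly/EAP-style logging subsystem in its configuration file (for example clustered.xml; the actual file depends on how the server is started), is to add a logger entry such as the sketch below:

```xml
<!-- Add inside the existing logging subsystem element of the server configuration -->
<logger category="org.infinispan.xsite.OfflineStatus">
  <level name="TRACE"/>
</logger>
```

The handler that writes server.log must also allow TRACE messages; if that handler defines a higher level, these messages are filtered out.
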
When using the after-failures parameter, search for "min failures reached" in the server.log file, as follows:

2019-07-12 16:08:09,654 TRACE [org.infinispan.xsite.OfflineStatus] (jgroups-45,jdg-d-cachesrv-01) Site is failed: min failures reached.
2019-07-12 16:08:09,654 INFO  [org.infinispan.CLUSTER] (jgroups-45,jdg-d-cachesrv-01) [Context=api-general-filestore][Context=jdg-d-cachesrv-01]ISPN100006: Site 'site02' is offline.

When using the min-wait parameter, search for "The minTimeToWait has passed" in the server.log file, as follows:

2019-07-15 15:11:36,371 TRACE [org.infinispan.xsite.OfflineStatus] (HotRod-ServerHandler-7-56) The minTimeToWait has passed: minTime=20000, timeSinceFirstFailure=38378

Infinispan updates the site status only when it needs to replicate data (for example, a put operation) to the backup site, so these messages appear only after a write to the cache.

