Chapter 6. Set Up Replication Mode

6.1. About Replication Mode

Red Hat JBoss Data Grid’s replication mode is a simple clustered mode. Cache instances automatically discover neighboring instances on other Java Virtual Machines (JVM) on the same network and subsequently form a cluster with the discovered instances. Any entry added to a cache instance is replicated across all cache instances in the cluster and can be retrieved locally from any cluster cache instance.

In JBoss Data Grid’s replication mode, return values are locally available before the replication occurs.

6.2. Optimized Replication Mode Usage

Replication mode is used for state sharing across a cluster; however, if you have a replicated cache and a large number of nodes are in use then there will be many writes to the replicated cache to keep all of the nodes synchronized. The amount of work performed will depend on many factors and on the specific use case, and for this reason it is recommended to ensure that each workload is tested thoroughly to determine if replication mode will be beneficial with the number of planned nodes. For many situations replication mode is not recommended once there are ten servers; however, in some workloads, such as if load read is important, this mode may be beneficial.

Red Hat JBoss Data Grid can be configured to use UDP multicast, which improves performance to a limited degree for larger clusters.

6.3. Configure Replication Mode

Replication mode is a clustered cache mode in Red Hat JBoss Data Grid. Replication mode can be added to any cache container, in both Library Mode and Remote Client-Server Mode, using the following procedure.

The replicated-cache Element

<cache-container name="clustered"
  <!-- Additional configuration information here -->
  <replicated-cache name="default"
    <!-- Additional configuration information here -->


JGroups must be appropriately configured for clustered mode before attempting to load this configuration.

The replicated-cache element configures settings for the distributed cache using the following parameters:

  1. The name parameter provides a unique identifier for the cache.
  2. If statistics are enabled at the container level, per-cache statistics can be selectively disabled for caches that do not require monitoring by setting the statistics attribute to false.

For details about the cache-container and locking, see the appropriate chapter.

6.4. Synchronous and Asynchronous Replication

6.4.1. Synchronous and Asynchronous Replication

Replication mode can be synchronous or asynchronous depending on the problem being addressed.

  • Synchronous replication blocks a thread or caller (for example on a put() operation) until the modifications are replicated across all nodes in the cluster. By waiting for acknowledgments, synchronous replication ensures that all replications are successfully applied before the operation is concluded.
  • Asynchronous replication operates significantly faster than synchronous replication because it does not need to wait for responses from nodes. Asynchronous replication performs the replication in the background and the call returns immediately. Errors that occur during asynchronous replication are written to a log. As a result, a transaction can be successfully completed despite the fact that replication of the transaction may not have succeeded on all the cache instances in the cluster.

6.4.2. Troubleshooting Asynchronous Replication Behavior

In some instances, a cache configured for asynchronous replication or distribution may wait for responses, which is synchronous behavior. This occurs because caches behave synchronously when both state transfers and asynchronous modes are configured. This synchronous behavior is a prerequisite for state transfer to operate as expected.

Use one of the following to remedy this problem:

  • Disable state transfer and use a ClusteredCacheLoader to lazily look up remote state as and when needed.
  • Enable state transfer and REPL_SYNC. Use the Asynchronous API (for example, the cache.putAsync(k, v)) to activate 'fire-and-forget' capabilities.
  • Enable state transfer and REPL_ASYNC. All RPCs end up becoming synchronous, but client threads will not be held up if a replication queue is enabled (which is recommended for asynchronous mode).

6.5. The Replication Queue

6.5.1. The Replication Queue

In replication mode, Red Hat JBoss Data Grid uses a replication queue to replicate changes across nodes based on the following:

  • Previously set intervals.
  • The queue size exceeding the number of elements.
  • A combination of previously set intervals and the queue size exceeding the number of elements.

The replication queue ensures that during replication, cache operations are transmitted in batches instead of individually. As a result, a lower number of replication messages are transmitted and fewer envelopes are used, resulting in improved JBoss Data Grid performance.

A disadvantage of using the replication queue is that the queue is periodically flushed based on the time or the queue size. Such flushing operations delay the realization of replication, distribution, or invalidation operations across cluster nodes. When the replication queue is disabled, the data is directly transmitted and therefore the data arrives at the cluster nodes faster.

A replication queue is used in conjunction with asynchronous mode.

6.5.2. Replication Queue Usage

When using the replication queue, do one of the following:

  • Disable asynchronous marshalling.
  • Set the max-threads count value to 1 for the executor attribute of the transport element. The executor is only available in Library Mode, and is therefore defined in its configuration file as follows:

    <transport executor="infinispan-transport"/>

To implement either of these solutions, the replication queue must be in use in asynchronous mode. Asynchronous mode can be set by defining mode="ASYNC", as seen in the following example:

Replication Queue in Asynchronous Mode

<replicated-cache name="asyncCache"
              <!-- Additional configuration information here -->

The replication queue allows requests to return to the client faster, therefore using the replication queue together with asynchronous marshalling does not present any significant advantages.

6.6. About Replication Guarantees

In a clustered cache, the user can receive synchronous replication guarantees as well as the parallelism associated with asynchronous replication. Red Hat JBoss Data Grid provides an asynchronous API for this purpose.

The asynchronous methods used in the API return Futures, which can be queried. The queries block the thread until a confirmation is received about the success of any network calls used.

6.7. Replication Traffic on Internal Networks

Some cloud providers charge less for traffic over internal IP addresses than for traffic over public IP addresses, or do not charge at all for internal network traffic (for example, ). To take advantage of lower rates, you can configure Red Hat JBoss Data Grid to transfer replication traffic using the internal network. With such a configuration, it is difficult to know the internal IP address you are assigned. JBoss Data Grid uses JGroups interfaces to solve this problem.