Synchronous vs. Asynchronous configuration for RHDG cluster

Environment

  • Red Hat Data Grid (RHDG)

Issue

  • Clustering with <async> mode is fast. Why isn't it the default?
  • What's wrong with async replication? Isn't it faster?
  • When should ASYNC caches be used?
  • Does SYNC mode ensure consistency?
  • How to decide whether to use SYNC or ASYNC mode for caches?

Resolution

In general, the choice is a trade-off between consistency and performance (see the CAP theorem).
Which mode is the best option depends on the use case and its requirements.

The decision can be made per cache, so a single Data Grid cluster configuration can contain both SYNC and ASYNC caches.

Cache mode=SYNC

The put operation waits until all owners of the entry (primary and backups) confirm that they have successfully applied the change. The operation can take longer as the number of owners grows.
In a replicated cache, every cluster node has to confirm the change.

When the put operation returns, all owners of the entry in the cluster are guaranteed to be up to date.

Cache mode=ASYNC

The put operation waits only until the primary owner of the entry has successfully applied the change. Updates to the backups are queued and applied later in a different thread.

When the put operation returns, the primary owner is updated and the backup updates are still pending.
In this case only a smart Hot Rod client is sure to read the correct value immediately, because it contacts the primary owner.
In embedded mode, or with other clients such as REST, a get issued directly after the put can be routed to a backup owner and return a stale value.
In the worst case, the primary owner fails to apply one or more changes to the backup owners. The cluster members are then out of sync until the next put succeeds, and the update can be lost if a rebalance happens in the meantime.
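
The difference between the two modes can be illustrated with a small, self-contained Java sketch. It is a toy model of the owners of one entry and does not use the Data Grid API: the class ReplicationSketch and its methods putSync, putAsync and flush are hypothetical names, and flush stands in for the background thread that applies queued backup updates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the owners of a cache entry; NOT the Data Grid API. */
final class ReplicationSketch {

    /** One owner, holding its local copy of the data. */
    static final class Owner {
        final Map<String, String> data = new HashMap<>();
    }

    private final Owner primary = new Owner();
    private final List<Owner> backups = new ArrayList<>();
    // Backup updates that an ASYNC put has queued but not yet applied.
    private final List<Runnable> pending = new ArrayList<>();

    ReplicationSketch(int backupCount) {
        for (int i = 0; i < backupCount; i++) {
            backups.add(new Owner());
        }
    }

    /** SYNC: returns only after the primary and every backup applied the change. */
    void putSync(String key, String value) {
        primary.data.put(key, value);
        for (Owner backup : backups) {
            backup.data.put(key, value);
        }
    }

    /** ASYNC: returns once the primary applied the change; backup updates are queued. */
    void putAsync(String key, String value) {
        primary.data.put(key, value);
        for (Owner backup : backups) {
            pending.add(() -> backup.data.put(key, value));
        }
    }

    /** Hypothetical stand-in for the thread that applies queued updates later. */
    void flush() {
        pending.forEach(Runnable::run);
        pending.clear();
    }

    String readFromPrimary(String key) {
        return primary.data.get(key);
    }

    String readFromBackup(int index, String key) {
        return backups.get(index).data.get(key);
    }

    public static void main(String[] args) {
        ReplicationSketch owners = new ReplicationSketch(1);
        owners.putSync("k", "v1");   // all owners hold v1
        owners.putAsync("k", "v2");  // only the primary holds v2 so far
        System.out.println("primary: " + owners.readFromPrimary("k"));
        System.out.println("backup before flush: " + owners.readFromBackup(0, "k"));
        owners.flush();
        System.out.println("backup after flush: " + owners.readFromBackup(0, "k"));
    }
}
```

Running main prints v2 for the primary, v1 for the backup before the flush (the stale read described above), and v2 for the backup after the flush. If the queued update were lost instead of flushed, the backup would keep v1, which corresponds to the out-of-sync case.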

CacheStore with write-through (default)

Nodes that own an entry return only after the cache store has successfully written the entry.
If the store is shared, only the primary owner writes the entry. If the store is not shared, every backup also writes the entry to its own persistence.

CacheStore with write-behind

The node queues all writes to the store and returns immediately. The modification-queue-size setting controls the number of entries in the queue. If the queue size is exceeded, the store falls back to write-through.

If the store fails, the entry is not written and a WARN message is logged.
This can lead to lost entries or inconsistencies after crashes and restarts.
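
The write-behind behaviour can be sketched in plain Java as well. This is a simplified model under stated assumptions, not the Data Grid persistence implementation: the class WriteBehindSketch and its methods are hypothetical, drain stands in for the background writer thread, and the sketch flushes older queued writes before a write-through so that write order is kept, which is a simplification chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of a write-behind store; NOT the Data Grid persistence API. */
final class WriteBehindSketch {

    // Stands in for the persistent store (file, database, ...).
    private final Map<String, String> store = new HashMap<>();
    // Pending {key, value} writes that have not reached the store yet.
    private final Queue<String[]> queue = new ArrayDeque<>();
    private final int modificationQueueSize;

    WriteBehindSketch(int modificationQueueSize) {
        this.modificationQueueSize = modificationQueueSize;
    }

    /**
     * Queues the write and returns true while the queue has room.
     * When the queue is full, falls back to write-through: the entry is
     * written to the store before returning, and the method returns false.
     */
    boolean write(String key, String value) {
        if (queue.size() < modificationQueueSize) {
            queue.add(new String[] {key, value});
            return true;
        }
        drain();               // assumption of this sketch: keep write order
        store.put(key, value); // synchronous write, as with write-through
        return false;
    }

    /** Hypothetical stand-in for the background thread that applies queued writes. */
    void drain() {
        String[] entry;
        while ((entry = queue.poll()) != null) {
            store.put(entry[0], entry[1]);
        }
    }

    /** A crash loses every write that is still queued. */
    void crash() {
        queue.clear();
    }

    String stored(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        WriteBehindSketch sketch = new WriteBehindSketch(1);
        sketch.write("a", "1"); // queued
        sketch.write("b", "2"); // queue full: written through
        sketch.write("c", "3"); // queued again
        sketch.crash();         // "c" never reaches the store
        System.out.println("a: " + sketch.stored("a"));
        System.out.println("b: " + sketch.stored("b"));
        System.out.println("c: " + sketch.stored("c"));
    }
}
```

In this run the entries a and b reach the store, while c is still queued when the crash happens and is lost, which is the kind of lost entry described above.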

Root Cause

Distributed and replicated caches support two different modes. The mode applies to the communication between the cluster nodes when an entry is added or changed.
SYNC mode favors consistency and ASYNC mode favors performance.

A similar choice exists for the persistence of cache stores. By default persistence is synchronous (write-through); configuring write-behind switches it to asynchronous.