Synchronous vs. Asynchronous configuration for RHDG cluster
Environment
- Red Hat Data Grid (RHDG)
Issue
- Clustering with <async> mode is fast. Why isn't it the default?
- What's wrong with async replication? Isn't it faster?
- When should ASYNC caches be used?
- Does SYNC mode ensure consistency?
- How to decide whether to use SYNC or ASYNC mode for caches?
Resolution
In general, the decision is whether to prefer consistency or performance (see the CAP theorem).
Which mode is the best option depends on the use case and its requirements.
The decision can be made at a fine-grained level, per cache, in the Data Grid cluster configuration.
Cache mode=SYNC
The put operation waits until all owners (primary and backups) of the entry have confirmed that the change was applied successfully. The operation might take longer depending on the number of owners.
Consider a replicated cache, where all cluster nodes need to confirm the change!
When the put operation returns, it is ensured that all owners of the entry are up to date (in a replicated cache, that is every member of the cluster).
Cache mode=ASYNC
The put operation waits until the primary owner of the entry has successfully applied the change. Updates to all backups are queued and performed later in a different thread.
When the put operation returns, the primary owner is updated and the backup updates are still pending.
In this case, only a smart Hot Rod client is sure to return the correct value immediately, because it contacts the primary owner directly.
In embedded mode or with, for example, REST clients, this can lead to a stale read directly after the put, because the get might be routed to a backup owner.
In the worst case, the primary owner cannot apply one or more changes to the backup owners because the update fails for some reason. The cluster members are then out of sync until the next put succeeds. This can lead to a lost update if a rebalance occurs.
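The per-cache choice described above could be expressed in the declarative configuration roughly as follows. This is a minimal sketch, not a recommended setup: the container name, the cache names, and owners="2" are placeholder assumptions that do not come from this article, and the exact schema depends on the Data Grid version in use.

```xml
<!-- Sketch only: all names and the owner count are illustrative assumptions. -->
<cache-container name="example">
  <!-- SYNC: a put returns only after all owners have confirmed the change. -->
  <distributed-cache name="orders-sync" mode="SYNC" owners="2"/>

  <!-- ASYNC: a put returns once the primary owner has applied the change;
       the backup owners are updated later. -->
  <distributed-cache name="metrics-async" mode="ASYNC" owners="2"/>

  <!-- Replicated cache in SYNC mode: every cluster node has to confirm each write. -->
  <replicated-cache name="reference-data" mode="SYNC"/>
</cache-container>
```

Because the mode is set on each cache element, consistency-critical data and data that tolerates stale reads can live in the same cluster under different modes.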
CacheStore with write-through (default)
Nodes that own an entry return only after the cache store has successfully written the entry.
If the store is shared, the write happens only on the primary owner. If it is not shared, every backup owner also writes the entry to its own persistence.
CacheStore with write-behind
The node queues all writes to the store and returns immediately. The number of entries in the queue can be controlled with modification-queue-size. If the queue size is exceeded, the store falls back to write-through until the queue can accept new entries again.
If the store fails, the entry is not written and a WARN message is logged.
This can lead to lost entries or inconsistencies in case of crashes and restarts.
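The two store behaviours above could be configured roughly as shown in this sketch. The container name, the cache names, and the queue size of 1024 are illustrative assumptions only. A file store is used here because it is local to each node; other store types and some Data Grid versions may need additional attributes.

```xml
<!-- Sketch only: names and the queue size are illustrative assumptions. -->
<cache-container name="example">
  <!-- Write-through (default): there is no write-behind element,
       so the put waits for the write to the store. -->
  <distributed-cache name="store-write-through" mode="SYNC">
    <persistence>
      <file-store/>
    </persistence>
  </distributed-cache>

  <!-- Write-behind: writes to the store are queued and the put returns immediately.
       modification-queue-size limits the number of queued entries. -->
  <distributed-cache name="store-write-behind" mode="SYNC">
    <persistence>
      <file-store>
        <write-behind modification-queue-size="1024"/>
      </file-store>
    </persistence>
  </distributed-cache>
</cache-container>
```

The cache mode and the store mode are independent settings, so a SYNC cache can still lose persisted entries on a crash if its store is configured as write-behind.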
Root Cause
There are two different modes for distributed and replicated caches. They apply to the communication between the cluster nodes when an entry is added or changed.
SYNC mode prefers consistency and ASYNC mode prefers performance.
A similar choice can be made for the persistence of cache stores. By default, persistence is synchronous. The configuration that switches it to asynchronous is write-behind.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.