
Chapter 1. Data Grid Caches

Data Grid caches provide flexible, in-memory data stores that you can configure to suit use cases such as:

  • boosting application performance with high-speed local caches.
  • optimizing databases by decreasing the volume of write operations.
  • providing resiliency and durability for consistent data across clusters.

1.1. Cache Interface

Cache<K,V> is the central interface for Data Grid and extends java.util.concurrent.ConcurrentMap.

Cache entries are highly concurrent data structures in key:value format that support a wide and configurable range of data types, from simple strings to much more complex objects.
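Because Cache<K,V> extends ConcurrentMap, basic cache operations follow the familiar ConcurrentMap contract. The following sketch illustrates that contract using java.util.concurrent.ConcurrentHashMap as a stand-in for a real Data Grid cache; the keys and values are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CacheContractSketch {
    public static void main(String[] args) {
        // Cache<K,V> inherits these methods from ConcurrentMap;
        // ConcurrentHashMap stands in for a Data Grid cache here.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

        cache.put("hello", "world");           // store an entry
        cache.putIfAbsent("hello", "ignored"); // no-op: key already exists
        String value = cache.get("hello");     // read an entry

        System.out.println(value);             // prints "world"

        cache.remove("hello");                 // delete an entry
        System.out.println(cache.size());      // prints 0
    }
}
```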

1.2. Cache Managers

Data Grid provides a CacheManager interface that lets you create, modify, and manage local or clustered caches. Cache Managers are the starting point for using Data Grid caches.

There are two CacheManager implementations:

EmbeddedCacheManager
Entry point for caches when running Data Grid inside the same Java Virtual Machine (JVM) as the client application, which is also known as Library Mode.
RemoteCacheManager
Entry point for caches when running Data Grid as a remote server in its own JVM. On startup, RemoteCacheManager establishes a persistent TCP connection to a Hot Rod endpoint on a Data Grid server.
Note

Both embedded and remote CacheManager implementations share some methods and properties. However, semantic differences do exist between EmbeddedCacheManager and RemoteCacheManager.
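As a sketch, obtaining a cache from each implementation might look like the following. The class names come from the Infinispan-based API that Data Grid ships; the cache names, host, and port are illustrative and must match your deployment. This code requires the Data Grid client libraries and, for the remote case, a running server, so it is a sketch rather than a self-contained program:

```java
import org.infinispan.Cache;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;

public class CacheManagerSketch {
    public static void main(String[] args) throws Exception {
        // Embedded: Data Grid runs in the same JVM as the application.
        DefaultCacheManager embedded = new DefaultCacheManager();
        embedded.defineConfiguration("local", new ConfigurationBuilder().build());
        Cache<String, String> localCache = embedded.getCache("local");
        localCache.put("k", "v");
        embedded.stop();

        // Remote: connects over Hot Rod to a Data Grid server.
        // Host and port are illustrative assumptions.
        org.infinispan.client.hotrod.configuration.ConfigurationBuilder remoteBuilder =
            new org.infinispan.client.hotrod.configuration.ConfigurationBuilder();
        remoteBuilder.addServer().host("127.0.0.1").port(11222);
        RemoteCacheManager remote = new RemoteCacheManager(remoteBuilder.build());
        RemoteCache<String, String> remoteCache = remote.getCache("mycache");
        remoteCache.put("k", "v");
        remote.stop();
    }
}
```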

1.3. Cache Containers

Cache containers declare one or more local or clustered caches that a Cache Manager controls.

Cache container declaration

<cache-container name="clustered" default-cache="default">
  <!-- Cache Manager configuration goes here. -->
</cache-container>

1.4. Cache Modes

Tip

Data Grid Cache Managers can create and control multiple caches that use different modes. For example, you can use the same Cache Manager for local caches, distributed caches, and caches with invalidation mode.

Local Caches
Data Grid runs as a single node and never replicates read or write operations on cache entries.
Clustered Caches
Data Grid instances running on the same network can automatically discover each other and form clusters to handle cache operations.
Invalidation Mode
Rather than replicating cache entries across the cluster, Data Grid evicts stale data from all nodes whenever operations modify entries in the cache. Data Grid performs local read operations only.
Replicated Caches
Data Grid replicates each cache entry on all nodes and performs local read operations only.
Distributed Caches
Data Grid stores cache entries across a subset of nodes and assigns entries to fixed owner nodes. Data Grid requests read operations from owner nodes to ensure it returns the correct value.
Scattered Caches
Data Grid stores cache entries across a subset of nodes. By default Data Grid assigns a primary owner and a backup owner to each cache entry in scattered caches. Data Grid assigns primary owners in the same way as with distributed caches, while backup owners are always the nodes that initiate the write operations. Data Grid requests read operations from at least one owner node to ensure it returns the correct value.
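In declarative configuration, each mode corresponds to its own cache element inside the cache container. A minimal sketch, assuming illustrative cache names and a two-owner distributed cache:

```xml
<cache-container name="clustered" default-cache="default">
  <local-cache name="local"/>
  <invalidation-cache name="invalidated"/>
  <replicated-cache name="replicated"/>
  <!-- owners controls how many nodes store each entry. -->
  <distributed-cache name="distributed" owners="2"/>
  <scattered-cache name="scattered"/>
</cache-container>
```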

1.4.1. Cache Mode Comparison

The cache mode that you should choose depends on the qualities and guarantees you need for your data.

The following table summarizes the primary differences between cache modes:

| Cache mode   | Clustered? | Read performance | Write performance        | Capacity                                                   | Availability | Capabilities                                         |
|--------------|------------|------------------|--------------------------|------------------------------------------------------------|--------------|------------------------------------------------------|
| Local        | No         | High (local)     | High (local)             | Single node                                                | Single node  | Complete                                             |
| Simple       | No         | Highest (local)  | Highest (local)          | Single node                                                | Single node  | Partial: no transactions, persistence, or indexing.  |
| Invalidation | Yes        | High (local)     | Low (all nodes, no data) | Single node                                                | Single node  | Partial: no indexing.                                |
| Replicated   | Yes        | High (local)     | Lowest (all nodes)       | Smallest node                                              | All nodes    | Complete                                             |
| Distributed  | Yes        | Medium (owners)  | Medium (owner nodes)     | Sum of all nodes' capacity divided by the number of owners. | Owner nodes  | Complete                                             |
| Scattered    | Yes        | Medium (primary) | Higher (single RPC)      | Sum of all nodes' capacity divided by 2.                   | Owner nodes  | Partial: no transactions.                            |