Chapter 2. The Cache API

2.1. The Cache API

The Cache interface provides simple methods for the addition, retrieval and removal of entries, which includes atomic mechanisms exposed by the JDK’s ConcurrentMap interface. How entries are stored depends on the cache mode in use. For example, an entry may be replicated to a remote node or an entry may be looked up in a cache store.

The Cache API is used in the same manner as the JDK Map API for basic tasks. This simplifies the process of migrating from Map-based, simple in-memory caches to Red Hat JBoss Data Grid’s cache.

Note

This API is not available in JBoss Data Grid’s Remote Client-Server Mode

2.2. Using the ConfigurationBuilder API to Configure the Cache API

Red Hat JBoss Data Grid uses a ConfigurationBuilder API to configure caches.

Caches are configured programmatically using the ConfigurationBuilder helper object.

The following is an example of a synchronously replicated cache configured programmatically using the ConfigurationBuilder API:

Programmatic Cache Configuration

Configuration c = new ConfigurationBuilder().clustering().cacheMode(CacheMode.REPL_SYNC).build();

String newCacheName = "repl";
manager.defineConfiguration(newCacheName, c);
Cache<String, String> cache = manager.getCache(newCacheName);

  1. In the first line of the configuration, a new cache configuration object (named c) is created using the ConfigurationBuilder . Configuration c is assigned the default values for all cache configuration options except the cache mode, which is overridden and set to synchronous replication (REPL_SYNC).
  2. In the second line of the configuration, a new variable (of type String) is created and assigned the value repl.
  3. In the third line of the configuration, the cache manager is used to define a named cache configuration for itself. This named cache configuration is called repl and its configuration is based on the configuration provided for cache configuration c in the first line.
  4. In the fourth line of the configuration, the cache manager is used to obtain a reference to the unique instance of the repl that is held by the cache manager. This cache instance is now ready to be used to perform operations to store and retrieve data.
Note

JBoss EAP includes its own underlying JMX. This can cause a collision when using the sample code with JBoss EAP and display an error such as org.infinispan.jmx.JmxDomainConflictException: Domain already registered org.infinispan.

To avoid this, configure global configuration as follows:

GlobalConfiguration glob = new GlobalConfigurationBuilder()
	.clusteredDefault()
        .globalJmxStatistics()
          .allowDuplicateDomains(true)
          .enable()
        .build();

2.3. Per-Invocation Flags

2.3.1. Per-Invocation Flags

Per-invocation flags can be used with caches in Red Hat JBoss Data Grid to specify behavior for each cache call. Per-invocation flags facilitate the implementation of potentially time saving optimizations.

2.3.2. Per-Invocation Flag Functions

The putForExternalRead() method in Red Hat JBoss Data Grid’s Cache API uses flags internally. This method can load a JBoss Data Grid cache with data loaded from an external resource. To improve the efficiency of this call, JBoss Data Grid calls a normal put operation passing the following flags:

  • The ZERO_LOCK_ACQUISITION_TIMEOUT flag: JBoss Data Grid uses an almost zero lock acquisition time when loading data from an external source into a cache.
  • The FAIL_SILENTLY flag: If the locks cannot be acquired, JBoss Data Grid fails silently without throwing any lock acquisition exceptions.
  • The FORCE_ASYNCHRONOUS flag: If clustered, the cache replicates asynchronously, irrespective of the cache mode set. As a result, a response from other nodes is not required.

Combining the flags above significantly increases the efficiency of the operation. The basis for this efficiency is that putForExternalRead calls of this type are used because the client can retrieve the required data from a persistent store if the data cannot be found in memory. If the client encounters a cache miss, it retries the operation.

A detailed list of all flags available for JBoss Data Grid is in the JBoss Data Grid API Documentation’s Flag class.

2.3.3. Configure Per-Invocation Flags

To use per-invocation flags in Red Hat JBoss Data Grid, add the required flags to the advanced cache via the withFlags() method call.

Configuring Per-Invocation Flags

Cache cache = ...
	cache.getAdvancedCache()
	   .withFlags(Flag.SKIP_CACHE_STORE, Flag.CACHE_MODE_LOCAL)
	   .put("local", "only");

Note

The called flags only remain active for the duration of the cache operation. To use the same flags in multiple invocations within the same transaction, use the withFlags() method for each invocation. If the cache operation must be replicated onto another node, the flags are also carried over to the remote nodes.

2.3.4. Per-Invocation Flags Example

In a use case for Red Hat JBoss Data Grid, where a write operation, such as put(), must not return the previous value, the IGNORE_RETURN_VALUES flag is used. This flag prevents a remote lookup (to get the previous value) in a distributed environment, which in turn prevents the retrieval of the undesired, potential, previous value. Additionally, if the cache is configured with a cache loader, this flag prevents the previous value from being loaded from its cache store.

Using the IGNORE_RETURN_VALUES Flag

Cache cache = ...
	cache.getAdvancedCache()
	   .withFlags(Flag.IGNORE_RETURN_VALUES)
	   .put("local", "only")

2.4. The AdvancedCache Interface

2.4.1. The AdvancedCache Interface

Red Hat JBoss Data Grid offers an AdvancedCache interface, geared towards extending JBoss Data Grid, in addition to its simple Cache Interface. The AdvancedCache Interface can:

  • Inject custom interceptors
  • Access certain internal components
  • Apply flags to alter the behavior of certain cache methods

The following code snippet presents an example of how to obtain an AdvancedCache:

AdvancedCache advancedCache = cache.getAdvancedCache();

2.4.2. Flag Usage with the AdvancedCache Interface

Flags, when applied to certain cache methods in Red Hat JBoss Data Grid, alter the behavior of the target method. Use AdvancedCache.withFlags() to apply any number of flags to a cache invocation.

Applying Flags to a Cache Invocation

advancedCache.withFlags(Flag.CACHE_MODE_LOCAL, Flag.SKIP_LOCKING)
   .withFlags(Flag.FORCE_SYNCHRONOUS)
   .put("hello", "world");

2.4.3. GET and PUT Usage in Distribution Mode

2.4.3.1. GET and PUT Usage in Distribution Mode

In distribution mode, the cache performs a remote GET command before a write command. This occurs because certain methods (for example, Cache.put()) return the previous value associated with the specified key according to the java.util.Map contract. When this is performed on an instance that does not own the key and the entry is not found in the L1 cache, the only reliable way to elicit this return value is to perform a remote GET before the PUT.

The GET operation that occurs before the PUT operation is always synchronous, whether the cache is synchronous or asynchronous, because Red Hat JBoss Data Grid must wait for the return value.

2.4.3.2. Distributed GET and PUT Operation Resource Usage

In distribution mode, the cache may execute a GET operation before executing the desired PUT operation.

This operation is very expensive in terms of resources. Despite operating in an synchronous manner, a remote GET operation does not wait for all responses, which would result in wasted resources. The GET process accepts the first valid response received, which allows its performance to be unrelated to cluster size.

Use the Flag.SKIP_REMOTE_LOOKUP flag for a per-invocation setting if return values are not required for your implementation.

Such actions do not impair cache operations and the accurate functioning of all public methods, but do break the java.util.Map interface contract. The contract breaks because unreliable and inaccurate return values are provided to certain methods. As a result, ensure that these return values are not used for any important purpose on your configuration.

2.4.4. Limitations of Map Methods

Specific Map methods, such as size(), values(), keySet() and entrySet(), can be used with certain limitations with Red Hat JBoss Data Grid as they are unreliable. These methods do not acquire locks (global or local) and concurrent modification, additions and removals are excluded from consideration in these calls.

The listed methods have a significant impact on performance. As a result, it is recommended that these methods are used for informational and debugging purposes only.

Performance Concerns

In JBoss Data Grid 7.1 the map methods size(), values(), keySet(), and entrySet() include entries in the cache loader by default. The cache loader in use will determine the performance of these commands; for instance, when using a database these methods will run a complete scan of the table where data is stored, which may result in slower processing. To not load entries from the cache loader, and avoid any potential performance hit, use Cache.getAdvancedCache().withFlags(Flag.SKIP_CACHE_LOAD) before executing the desired method.

Understanding the size() Method (Embedded Caches)

In JBoss Data Grid 7.1 the Cache.size() method provides a count of all elements in both this cache and cache loader across the entire cluster. When using a loader or remote entries, only a subset of entries is held in memory at any given time to prevent possible memory issues, and the loading of all entries may be slow.

In this mode of operation, the result returned by the size() method is affected by the flags org.infinispan.context.Flag#CACHE_MODE_LOCAL, to force it to return the number of entries present on the local node, and org.infinispan.context.Flag#SKIP_CACHE_LOAD, to ignore any passivated entries. Either of these flags may be used to increase performance of this method, at the cost of not returning a count of all elements across the entire cluster.

Understanding the size() Method (Remote Caches)

In JBoss Data Grid 7.1 the Hot Rod protocol contain a dedicated SIZE operation, and the clients use this operation to calculate the size of all entries.