Chapter 20. Cache Stores

20.1. Cache Stores

The cache store connects Red Hat JBoss Data Grid to the persistent data store. Cache stores are associated with individual caches. Different caches attached to the same cache manager can have different cache store configurations.

Note

If a clustered cache is configured with an unshared cache store (where shared is set to false), on node join, stale entries which might have been removed from the cluster might still be present in the stores and can reappear.

20.2. Cache Loaders and Cache Writers

Integration with the persistent store is done through the following SPIs located in org.infinispan.persistence.spi:

  • CacheLoader
  • CacheWriter
  • AdvancedCacheLoader
  • AdvancedCacheWriter

CacheLoader and CacheWriter provide basic methods for reading and writing to a store. CacheLoader retrieves data from a data store when the required data is not present in the cache, and CacheWriter is used to enforce entry passivation and activation on eviction in a cache.

AdvancedCacheLoader and AdvancedCacheWriter provide operations to manipulate the underlying storage in bulk: parallel iteration and purging of expired entries, clear and size.

The org.infinispan.persistence.file.SingleFileStore is a good starting point to write your own store implementation.

Note

Previously, JBoss Data Grid used the old API (CacheLoader, extended by CacheStore), which is also still available.

20.3. Cache Store Configuration

20.3.1. Configuring the Cache Store

Cache stores can be configured in a chain. Cache read operations checks each cache store in the order configured until a valid non-null element of data has been located. Write operations affect all cache stores unless the ignoreModifications element has been set to "true" for a specific cache store.

20.3.2. Configure the Cache Store using XML (Library Mode)

The following example demonstrates cache store configuration using XML in JBoss Data Grid’s Library mode:

<persistence passivation="false">
   <file-store shared="false"
               preload="true"
               fetch-state="true"
               purge-startup="false"
               singleton="true"
               location="${java.io.tmpdir}" >
      <write-behind enabled="true"
             flush-lock-timeout="15000"
             thread-pool-size="5" />
   </singleFile>
</persistence>

For details about the elements and parameters used in this sample configuration, see Cache Store Configuration Details (Library Mode).

20.3.3. About SKIP_CACHE_LOAD Flag

In Red Hat JBoss Data Grid’s Remote Client-Server mode, when the cache is preloaded from a cache store and eviction is disabled, read requests go to the memory. If the entry is not found in a memory during a read request, it accesses the cache store which may impact the read performance.

To avoid referring to the cache store when a key is not found in the memory, use the SKIP_CACHE_LOAD flag.

20.3.4. About the SKIP_CACHE_STORE Flag

When the SKIP_CACHE_STORE Flag is used then the cache store will not be considered for the specified cache operations. This flag can be useful to place an entry in the cache without having it included in the configured cache store, along with determining if an entry is found within a cache without retrieving it from the associated cache store.

20.3.5. About the SKIP_SHARED_CACHE_STORE Flag

When the SKIP_SHARED_CACHE_STORE Flag is enabled then any shared cache store will not be considered for the specified cache operations. This flag can be useful to place an entry in the cache without having it included in the shared cache store, along with determining if an entry is found within a cache without retrieving it from the shared cache store.

20.4. Shared Cache Stores

20.4.1. Shared Cache Stores

A shared cache store is a cache store that is shared by multiple cache instances.

A shared cache store is useful when all instances in a cluster communicate with the same remote, shared database using the same JDBC settings. In such an instance, configuring a shared cache store prevents the unnecessary repeated write operations that occur when various cache instances attempt to write the same data to the cache store.

20.4.2. Invalidation Mode and Shared Cache Stores

When used in conjunction with a shared cache store, Red Hat JBoss Data Grid’s invalidation mode causes remote caches to see the shared cache store to retrieve modified data.

The benefits of using invalidation mode in conjunction with shared cache stores include the following:

  • Compared to replication messages, which contain the updated data, invalidation messages are much smaller and result in reduced network traffic.
  • The remaining cluster caches look up modified data from the shared cache store lazily and only when required to do so, resulting in further reduced network traffic.

20.4.3. The Cache Store and Cache Passivation

In Red Hat JBoss Data Grid, a cache store can be used to enforce the passivation of entries and to activate eviction in a cache. Whether passivation mode or activation mode are used, the configured cache store both reads from and writes to the data store.

When passivation is disabled in JBoss Data Grid, after the modification, addition or removal of an element is carried out the cache store steps in to persist the changes in the store.

20.4.4. Application Cachestore Registration

It is not necessary to register an application cache store for an isolated deployment. This is not a requirement in Red Hat JBoss Data Grid because lazy deserialization is used to work around this problem.

20.5. Connection Factories

20.5.1. Connection Factories

In Red Hat JBoss Data Grid, all JDBC cache stores rely on a ConnectionFactory implementation to obtain a database connection. This process is also known as connection management or pooling.

A connection factory can be specified using the ConnectionFactoryClass configuration attribute. JBoss Data Grid includes the following ConnectionFactory implementations:

  • ManagedConnectionFactory
  • SimpleConnectionFactory.
  • PooledConnectionFactory.

20.5.2. About ManagedConnectionFactory

ManagedConnectionFactory is a connection factory that is ideal for use within managed environments such as application servers. This connection factory can explore a configured location in the JNDI tree and delegate connection management to the DataSource.

20.5.3. About SimpleConnectionFactory

SimpleConnectionFactory is a connection factory that creates database connections on a per invocation basis. This connection factory is not designed for use in a production environment.

20.5.4. About PooledConnectionFactory

PooledConnectionFactory is a connection factory based on C3P0, and is typically recommended for standalone deployments as opposed to deployments utilizing a servlet container, such as JBoss EAP. This connection factory functions by allowing the user to define a set of parameters which may be used for all DataSource instances generated by the factory.