Chapter 10. Setting Up Persistent Storage

Data Grid can persist in-memory data to external storage, giving you additional capabilities to manage your data such as:

Durability: Adding cache stores allows you to persist data to non-volatile storage so it survives restarts.
Write-through caching: Configuring Data Grid as a caching layer in front of persistent storage simplifies data access for applications because Data Grid handles all interactions with the external storage.
Data overflow: Using eviction and passivation techniques ensures that Data Grid keeps only frequently used data in-memory and writes older entries to persistent storage.

10.1. Data Grid Cache Stores

Cache stores connect Data Grid to persistent data sources and implement the following interfaces:

org.infinispan.persistence.spi.CacheLoader: Allows Data Grid to load data from persistent storage.
org.infinispan.persistence.spi.CacheWriter: Allows Data Grid to persist data to persistent storage.

10.1.1. Configuring Cache Stores

Add cache stores to Data Grid caches in a chain either declaratively or programmatically. Cache read operations check each cache store in the configured order until they locate a valid non-null element of data. Write operations affect all cache stores except for those that you configure as read only.
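
The read and write behavior of a store chain can be modelled in a few lines of plain Java. The following is only an illustrative sketch and not Data Grid code; the class and method names (StoreChainSketch, Store, load, write) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a cache store chain; not part of the Data Grid API.
class StoreChainSketch {
    static final class Store {
        final Map<String, String> data = new HashMap<>();
        final boolean readOnly;
        Store(boolean readOnly) { this.readOnly = readOnly; }
    }

    private final List<Store> stores = new ArrayList<>();

    StoreChainSketch add(Store store) {
        stores.add(store);
        return this;
    }

    // Reads check each store in configured order and stop at the first non-null value.
    String load(String key) {
        for (Store store : stores) {
            String value = store.data.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Writes go to every store except those marked read-only.
    void write(String key, String value) {
        for (Store store : stores) {
            if (!store.readOnly) {
                store.data.put(key, value);
            }
        }
    }

    public static void main(String[] args) {
        Store primary = new Store(false);
        Store archive = new Store(true);
        archive.data.put("k0", "archived");
        StoreChainSketch chain = new StoreChainSketch().add(primary).add(archive);
        chain.write("k1", "v1");
        System.out.println(chain.load("k0") + " " + chain.load("k1"));
    }
}
```

In this model the read-only archive store still serves reads for keys that the first store does not hold, but it never receives new writes.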

Procedure

Use the persistence parameter to configure the persistence layer for caches.
Add Data Grid cache stores with the appropriate configuration, as in the following examples:

Declarative configuration for a file-based cache store

<persistence passivation="false">
   <!-- note that class is missing and is induced by the fileStore element name -->
   <file-store
           shared="false" preload="true"
           fetch-state="true"
           read-only="false"
           purge="false"
           path="${java.io.tmpdir}">
      <write-behind thread-pool-size="5" />
   </file-store>
</persistence>

Declarative configuration for a custom cache store configuration

<local-cache name="myCustomStore">
   <persistence passivation="false">
      <store
         class="org.acme.CustomStore"
         fetch-state="false" preload="true" shared="false"
         purge="true" read-only="false" segmented="true">

         <write-behind modification-queue-size="123" thread-pool-size="23" />

         <property name="myProp">${system.property}</property>
      </store>
   </persistence>
</local-cache>

Note

Custom cache stores include property parameters that let you configure specific attributes for your cache store.

Programmatic configuration for a single file cache store

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.persistence()
      .passivation(false)
      .addSingleFileStore()
         .preload(true)
         .shared(false)
         .fetchPersistentState(true)
         .ignoreModifications(false)
         .purgeOnStartup(false)
         .location(System.getProperty("java.io.tmpdir"))
         .async()
            .enabled(true)
            .threadPoolSize(5);

10.1.2. Setting a Global Persistent Location for File-Based Cache Stores

Data Grid uses a global filesystem location for saving data to persistent storage.

Important

The global persistent location must be unique to each Data Grid instance. To share data between multiple instances, use a shared persistent location.

Data Grid servers use the $RHDG_HOME/server/data directory as the global persistent location.

If you are using Data Grid as a library embedded in custom applications, the global persistent location defaults to the user.dir system property. This system property typically uses the directory where your application starts. You should configure a global persistent location to use a suitable location.

Declarative configuration

<cache-container default-cache="myCache">
   <global-state>
      <persistent-location path="example" relative-to="my.data"/>
   </global-state>
   ...
</cache-container>

Programmatic configuration

GlobalConfigurationBuilder global = new GlobalConfigurationBuilder();
global.globalState().enable()
      // persistentLocation(String path, String relativeTo)
      .persistentLocation("example", "my.data");

File-Based Cache Stores and Global Persistent Location

When using file-based cache stores, you can optionally specify filesystem directories for storage. Unless absolute paths are declared, directories are always relative to the global persistent location.

For example, you configure your global persistent location as follows:

<global-state>
   <persistent-location path="/tmp/example" relative-to="my.data"/>
</global-state>

You then configure a Single File cache store that uses a path named myDataStore as follows:

<file-store path="myDataStore"/>

In this case, the configuration results in a Single File cache store at /tmp/example/myDataStore/myCache.dat.

If you attempt to set an absolute path that resides outside the global persistent location, Data Grid throws the following exception:

ISPN000558: "The store location 'foo' is not a child of the global persistent location 'bar'"
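
The resolution rule can be approximated with java.nio.file.Path. This is a minimal sketch under the assumption that a store location is valid only if, after resolution, it lies under the global persistent location; StoreLocationSketch and resolve are hypothetical names, and Data Grid performs its own validation internally.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that mimics the "child of the global location" rule.
class StoreLocationSketch {
    static Path resolve(String globalLocation, String storePath) {
        Path global = Paths.get(globalLocation).toAbsolutePath().normalize();
        // Path.resolve() returns storePath unchanged if it is already absolute.
        Path resolved = global.resolve(storePath).normalize();
        if (!resolved.startsWith(global)) {
            throw new IllegalArgumentException("The store location '" + storePath
                    + "' is not a child of the global persistent location '"
                    + globalLocation + "'");
        }
        return resolved;
    }

    public static void main(String[] args) {
        // A relative store path ends up under the global location.
        System.out.println(resolve("/tmp/example", "myDataStore"));
    }
}
```

A relative path such as myDataStore resolves beneath the global location, while a path that escapes it, for example ../elsewhere, is rejected.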

10.1.3. Passivation

Passivation configures Data Grid to write entries to cache stores when it evicts those entries from memory. In this way, passivation ensures that only a single copy of an entry is maintained, either in-memory or in a cache store, which prevents unnecessary and potentially expensive writes to persistent storage.

Activation is the process of restoring entries to memory from the cache store when threads attempt to access passivated entries. For this reason, when you enable passivation, you must configure cache stores that implement both CacheWriter and CacheLoader interfaces so they can write and load entries from persistent storage.

When Data Grid evicts an entry from the cache, it notifies cache listeners that the entry is passivated then stores the entry in the cache store. When Data Grid gets an access request for an evicted entry, it lazily loads the entry from the cache store into memory and then notifies cache listeners that the entry is activated.

Note

Passivation uses the first cache loader in the Data Grid configuration and ignores all others.
Passivation is not supported with:
- Transactional stores. Passivation writes and removes entries from the store outside the scope of the actual Data Grid commit boundaries.
- Shared stores. Shared cache stores require entries to always exist in the store for other owners. For this reason, passivation is not supported because entries cannot be removed.

If you enable passivation with transactional stores or shared stores, Data Grid throws an exception.

10.1.3.1. Passivation and Cache Stores

Passivation disabled

Writes to data in memory result in writes to persistent storage.

If Data Grid evicts data from memory, then data in persistent storage includes entries that are evicted from memory. In this way persistent storage is a superset of the in-memory cache.

If you do not configure eviction, then data in persistent storage provides a copy of data in memory.

Passivation enabled

Data Grid adds data to persistent storage only when it evicts data from memory.

When Data Grid activates entries, it restores data in memory and deletes data from persistent storage. In this way, data in memory and data in persistent storage form separate subsets of the entire data set, with no intersection between the two.

Note

Entries in persistent storage can become stale when using shared cache stores. This occurs because Data Grid does not delete passivated entries from shared cache stores when they are activated.

Values are updated in memory but previously passivated entries remain in persistent storage with out of date values.

The following table shows data in memory and in persistent storage after a series of operations:

Operation	Passivation disabled	Passivation enabled	Passivation enabled with shared cache store
Insert k1.	Memory: k1 Disk: k1	Memory: k1 Disk: -	Memory: k1 Disk: -
Insert k2.	Memory: k1, k2 Disk: k1, k2	Memory: k1, k2 Disk: -	Memory: k1, k2 Disk: -
Eviction thread runs and evicts k1.	Memory: k2 Disk: k1, k2	Memory: k2 Disk: k1	Memory: k2 Disk: k1
Read k1.	Memory: k1, k2 Disk: k1, k2	Memory: k1, k2 Disk: -	Memory: k1, k2 Disk: k1
Eviction thread runs and evicts k2.	Memory: k1 Disk: k1, k2	Memory: k1 Disk: k2	Memory: k1 Disk: k1, k2
Remove k2.	Memory: k1 Disk: k1	Memory: k1 Disk: -	Memory: k1 Disk: k1
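
The table can be reproduced with a small simulation. The following plain Java sketch is not Data Grid code; it models memory and disk as sets of keys, and the names PassivationSketch, insert, evict, read, and remove are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the table above: memory and disk hold sets of keys.
class PassivationSketch {
    final Set<String> memory = new TreeSet<>();
    final Set<String> disk = new TreeSet<>();
    final boolean passivation;
    final boolean shared;

    PassivationSketch(boolean passivation, boolean shared) {
        this.passivation = passivation;
        this.shared = shared;
    }

    // Without passivation every write also goes to the store.
    void insert(String key) {
        memory.add(key);
        if (!passivation) {
            disk.add(key);
        }
    }

    // With passivation the entry is written to the store only on eviction.
    void evict(String key) {
        if (memory.remove(key) && passivation) {
            disk.add(key);
        }
    }

    // Activation: load an evicted entry back; non-shared passivation removes it from disk.
    void read(String key) {
        if (memory.contains(key) || !disk.contains(key)) {
            return;
        }
        memory.add(key);
        if (passivation && !shared) {
            disk.remove(key);
        }
    }

    void remove(String key) {
        memory.remove(key);
        disk.remove(key);
    }

    String state() {
        return "Memory: " + memory + " Disk: " + disk;
    }

    public static void main(String[] args) {
        PassivationSketch cache = new PassivationSketch(true, false);
        cache.insert("k1");
        cache.insert("k2");
        cache.evict("k1");
        cache.read("k1");
        System.out.println(cache.state());
    }
}
```

Running the six operations from the table against each of the three flag combinations yields the same memory and disk contents as the corresponding column.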

10.1.4. Cache Loaders and Transactional Caches

Only JDBC String-Based cache stores support transactional operations. If you configure caches as transactional, you should set transactional=true to keep data in persistent storage synchronized with data in memory.

For all other cache stores, Data Grid does not enlist cache loaders in transactional operations. This can result in data inconsistency if transactions succeed in modifying data in memory but do not completely apply changes to data in the cache store. In this case manual recovery does not work with cache stores.

Reference

JDBC String-Based Cache Stores

10.1.5. Segmented Cache Stores

Cache stores can organize data into hash space segments to which keys map.

Segmented stores increase read performance for bulk operations; for example, streaming over data (Cache.size, Cache.entrySet.stream), pre-loading the cache, and doing state transfer operations.

However, segmented stores can also result in loss of performance for write operations. This performance loss applies particularly to batch write operations that can take place with transactions or write-behind stores. For this reason, you should evaluate the overhead for write operations before you enable segmented stores. The performance gain for bulk read operations might not be worthwhile if there is a significant performance loss for write operations.

Important

The number of segments you configure for cache stores must match the number of segments you define in the Data Grid configuration with the clustering.hash.numSegments parameter.

If you change the numSegments parameter in the configuration after you add a segmented cache store, Data Grid cannot read data from that cache store.
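
To see why the segment count must stay fixed, consider how a key maps to a segment. The sketch below uses String.hashCode() modulo the number of segments purely for illustration; Data Grid uses its own hash function, and SegmentSketch, segmentOf, and groupBySegment are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key-to-segment mapping; the hash function is a stand-in.
class SegmentSketch {
    static int segmentOf(String key, int numSegments) {
        // floorMod keeps the result in [0, numSegments) even for negative hash codes.
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Groups keys by segment, which is what lets bulk reads work segment by segment.
    static Map<Integer, List<String>> groupBySegment(List<String> keys, int numSegments) {
        Map<Integer, List<String>> bySegment = new TreeMap<>();
        for (String key : keys) {
            bySegment.computeIfAbsent(segmentOf(key, numSegments), s -> new ArrayList<>()).add(key);
        }
        return bySegment;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("k1", "k2", "k3");
        System.out.println(groupBySegment(keys, 256));
        // With a different segment count the same keys can land in different segments.
        System.out.println(groupBySegment(keys, 100));
    }
}
```

Because the segment for a key depends on the segment count, data written under one count is generally looked up in the wrong place under another.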

Reference

Key Ownership

10.1.6. Filesystem-Based Cache Stores

In most cases, filesystem-based cache stores are appropriate for local cache stores for data that overflows from memory because it exceeds size and/or time restrictions.

Warning

You should not use filesystem-based cache stores on shared file systems such as an NFS, Microsoft Windows, or Samba share. Shared file systems do not provide file locking capabilities, which can lead to data corruption.

Likewise, shared file systems are not transactional. If you attempt to use transactional caches with shared file systems, unrecoverable failures can happen when writing to files during the commit phase.

10.1.7. Write-Through

Write-Through is a cache writing mode where writes to memory and writes to cache stores are synchronous. When a client application updates a cache entry, in most cases by invoking Cache.put(), Data Grid does not return the call until it updates the cache store. This cache writing mode results in updates to the cache store concluding within the boundaries of the client thread.

The primary advantage of Write-Through mode is that the cache and cache store are updated simultaneously, which ensures that the cache store is always consistent with the cache.

However, Write-Through mode can potentially decrease performance because the need to access and update cache stores directly adds latency to cache operations.

Data Grid defaults to Write-Through mode unless you explicitly configure Write-Behind mode on cache stores.

Write-through configuration

<persistence passivation="false">
   <file-store fetch-state="true"
               read-only="false"
               purge="false" path="${java.io.tmpdir}"/>
</persistence>

Reference

Write-Behind

10.1.8. Write-Behind

Write-Behind is a cache writing mode where writes to memory are synchronous and writes to cache stores are asynchronous. When a client application updates a cache entry, Data Grid adds the update to a modification queue and then modifies the cache store in a different thread than the client thread.

You can configure the number of threads that consume the modification queue and apply updates to the underlying cache store. The modification queue fills up if there are not enough threads to handle the updates or if the underlying cache store becomes unavailable. When this occurs, Data Grid uses Write-Through mode until the modification queue can accept new entries.

Write-Behind mode provides a performance advantage over Write-Through mode because cache operations do not need to wait for updates to the underlying cache store to complete. However, data in the cache store remains inconsistent with data in the cache until the modification queue is processed. For this reason, Write-Behind mode is suitable for cache stores with low latency, such as unshared and local filesystem-based cache stores, where the time between the write to the cache and the write to the cache store is as small as possible.
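
The queueing behavior can be sketched in plain Java. This is a simplified, single-threaded model and not Data Grid code: the drain() method stands in for the background threads, the fallback flushes pending writes before writing through so that ordering is preserved (a design choice of this sketch), and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical single-threaded model of a write-behind modification queue.
class WriteBehindSketch {
    final Map<String, String> memory = new HashMap<>();
    final Map<String, String> store = new HashMap<>();
    private final Queue<String[]> queue = new ArrayDeque<>();
    private final int queueSize;

    WriteBehindSketch(int queueSize) {
        this.queueSize = queueSize;
    }

    void put(String key, String value) {
        memory.put(key, value);                    // the write to memory is always synchronous
        if (queue.size() < queueSize) {
            queue.add(new String[] {key, value});  // the store write is deferred
        } else {
            drain();                               // queue full: apply pending writes first
            store.put(key, value);                 // then write through in the caller
        }
    }

    // Stands in for the threads that consume the modification queue.
    void drain() {
        String[] modification;
        while ((modification = queue.poll()) != null) {
            store.put(modification[0], modification[1]);
        }
    }

    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        WriteBehindSketch cache = new WriteBehindSketch(2);
        cache.put("a", "1");
        cache.put("b", "2");
        System.out.println("pending=" + cache.pending() + " store=" + cache.store);
        cache.drain();
        System.out.println("pending=" + cache.pending() + " store=" + cache.store);
    }
}
```

Until drain() runs, the store lags behind memory; any modification still pending when the process stops never reaches the store.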

Write-behind configuration

<persistence passivation="false">
   <file-store fetch-state="true"
               read-only="false"
               purge="false" path="${java.io.tmpdir}">
      <write-behind modification-queue-size="123"
                    thread-pool-size="23"
                    fail-silently="true"/>
   </file-store>
</persistence>

The preceding configuration example uses the fail-silently parameter to control what happens when either the cache store is unavailable or the modification queue is full.

If fail-silently="true" then Data Grid logs WARN messages and rejects write operations.
If fail-silently="false" then Data Grid throws exceptions if it detects the cache store is unavailable during a write operation. Likewise if the modification queue becomes full, Data Grid throws an exception.
In some cases, data loss can occur if Data Grid restarts and write operations exist in the modification queue. For example the cache store goes offline but, during the time it takes to detect that the cache store is unavailable, write operations are added to the modification queue because it is not full. If Data Grid restarts or otherwise becomes unavailable before the cache store comes back online, then the write operations in the modification queue are lost because they were not persisted.

Reference

Write-Through

10.2. Cache Store Implementations

Data Grid provides several cache store implementations that you can use. Alternatively you can provide custom cache stores.

10.2.1. Cluster Cache Loaders

ClusterCacheLoader retrieves data from other Data Grid cluster members but does not persist data. In other words, ClusterCacheLoader is not a cache store.

ClusterCacheLoader provides a non-blocking partial alternative to state transfer. ClusterCacheLoader fetches keys from other nodes on demand if those keys are not available on the local node, which is similar to lazily loading cache content.

The following points also apply to ClusterCacheLoader:

Preloading does not take effect (preload=true).
Fetching persistent state is not supported (fetch-state=true).
Segmentation is not supported.

Declarative configuration

<persistence>
   <cluster-loader remote-timeout="500"/>
</persistence>

Programmatic configuration

ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence()
    .addClusterLoader()
    .remoteCallTimeout(500);

10.2.2. Single File Cache Stores

Single File cache stores, SingleFileStore, persist data to a file. Data Grid keeps keys and values in the file and also maintains an in-memory index of keys. By default, Single File cache stores are segmented, which means that Data Grid creates a separate file for each segment.

Because SingleFileStore keeps an in-memory index of keys and the location of values, it requires additional memory, depending on the key size and the number of keys. For this reason, SingleFileStore is not recommended for use cases where the keys have a large size.

In some cases, SingleFileStore can also become fragmented. If the size of values continually increases, available space in the single file is not used but the entry is appended to the end of the file. Available space in the file is used only if an entry can fit within it. Likewise, if you remove all entries from memory, the single file store does not decrease in size or become defragmented.
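
The fragmentation effect can be illustrated with a toy allocator. The sketch below is not the SingleFileStore implementation; it assumes a first-fit search over freed regions, ignores the leftover space when a smaller entry reuses a larger region, and uses hypothetical names throughout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical first-fit allocator that reuses freed space only when the entry fits.
class FragmentationSketch {
    private final List<int[]> freeSlots = new ArrayList<>(); // each slot is {offset, size}
    private int fileLength = 0;

    // Returns the offset at which an entry of the given size is written.
    int allocate(int size) {
        for (int i = 0; i < freeSlots.size(); i++) {
            int[] slot = freeSlots.get(i);
            if (slot[1] >= size) {
                freeSlots.remove(i);
                return slot[0];
            }
        }
        int offset = fileLength; // nothing fits: append to the end of the file
        fileLength += size;
        return offset;
    }

    // Marks a region as free; the file itself never shrinks.
    void free(int offset, int size) {
        freeSlots.add(new int[] {offset, size});
    }

    int fileLength() {
        return fileLength;
    }

    public static void main(String[] args) {
        FragmentationSketch file = new FragmentationSketch();
        int first = file.allocate(100);
        file.free(first, 100);
        // The grown value does not fit the freed region, so it is appended.
        System.out.println(file.allocate(150) + " " + file.fileLength());
    }
}
```

When values keep growing, each rewrite is appended and the freed regions accumulate, which is the pattern the paragraph above describes.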

Declarative configuration

<persistence>
   <file-store max-entries="5000"/>
</persistence>

Programmatic configuration

For embedded deployments, do the following:

ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence()
    .addSingleFileStore()
    .maxEntries(5000);

For server deployments, do the following:

import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.client.hotrod.configuration.NearCacheMode;
...

ConfigurationBuilder builder = new ConfigurationBuilder();
builder
  .remoteCache("mycache")
    .configuration("<infinispan><cache-container><distributed-cache name=\"mycache\"><persistence><file-store max-entries=\"5000\"/></persistence></distributed-cache></cache-container></infinispan>");

Segmentation

Single File cache stores support segmentation and create a separate instance per segment, which results in multiple directories in the path you configure. Each directory is a number that represents the segment to which the data maps.

10.2.3. JDBC String-Based Cache Stores

JDBC String-Based cache stores, JdbcStringBasedStore, use JDBC drivers to load and store values in the underlying database.

JdbcStringBasedStore stores each entry in its own row in the table to increase throughput for concurrent loads. JdbcStringBasedStore also uses a simple one-to-one mapping that maps each key to a String object using the key-to-string-mapper interface.

Data Grid provides a default implementation, DefaultTwoWayKey2StringMapper, that handles primitive types.
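
For keys that are not primitive types, the mapping must be reversible so that keys can be rebuilt when entries are loaded. The following standalone sketch shows the idea for a composite key. It does not implement any Data Grid interface; the class, the method names, and the ":" separator are assumptions made for illustration, and the sketch assumes the state value never contains the separator.

```java
// Hypothetical two-way mapping between a composite key and a String.
class VehicleKeyMapperSketch {
    static final class VehicleKey {
        final String state;
        final String licensePlate;

        VehicleKey(String state, String licensePlate) {
            this.state = state;
            this.licensePlate = licensePlate;
        }
    }

    // Key to String: the value that would be stored in the ID column.
    static String getStringMapping(VehicleKey key) {
        return key.state + ":" + key.licensePlate;
    }

    // String to key: splits at the first separator to rebuild the key.
    static VehicleKey getKeyMapping(String stringKey) {
        int separator = stringKey.indexOf(':');
        return new VehicleKey(stringKey.substring(0, separator),
                stringKey.substring(separator + 1));
    }

    public static void main(String[] args) {
        String id = getStringMapping(new VehicleKey("CA", "7ABC123"));
        VehicleKey key = getKeyMapping(id);
        System.out.println(id + " " + key.state + " " + key.licensePlate);
    }
}
```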

Note

By default, Data Grid stores are not shared, which means that all nodes in the cluster write to the underlying store on each update. If you want operations to write to the underlying database only once, you must configure the JDBC store as shared.

Segmentation

JdbcStringBasedStore uses segmentation by default and requires a column in the database table to represent the segments to which entries belong.

10.2.3.1. Connection Factories

JdbcStringBasedStore relies on a ConnectionFactory implementation to connect to a database.

Data Grid provides the following ConnectionFactory implementations:

PooledConnectionFactoryConfigurationBuilder

A connection factory based on Agroal that you configure via PooledConnectionFactoryConfiguration.

Alternatively, you can specify configuration properties prefixed with org.infinispan.agroal. as in the following example:

org.infinispan.agroal.metricsEnabled=false

org.infinispan.agroal.minSize=10
org.infinispan.agroal.maxSize=100
org.infinispan.agroal.initialSize=20
org.infinispan.agroal.acquisitionTimeout_s=1
org.infinispan.agroal.validationTimeout_m=1
org.infinispan.agroal.leakTimeout_s=10
org.infinispan.agroal.reapTimeout_m=10

org.infinispan.agroal.autoCommit=true
org.infinispan.agroal.jdbcTransactionIsolation=READ_COMMITTED
org.infinispan.agroal.jdbcUrl=jdbc:h2:mem:PooledConnectionFactoryTest;DB_CLOSE_DELAY=-1
org.infinispan.agroal.driverClassName=org.h2.Driver
org.infinispan.agroal.principal=sa
org.infinispan.agroal.credential=sa

You then configure Data Grid to use your properties file via PooledConnectionFactoryConfiguration.propertyFile.

Note

You should use PooledConnectionFactory with standalone deployments, rather than deployments in servlet containers.

ManagedConnectionFactoryConfigurationBuilder

A connection factory that you can use with managed environments such as application servers. This connection factory can look up a configurable location in the JNDI tree and delegate connection management to the DataSource.

SimpleConnectionFactoryConfigurationBuilder

A connection factory that creates database connections on a per invocation basis. You should use this connection factory for test or development environments only.

10.2.3.2. JDBC String-Based Cache Store Configuration

You can configure JdbcStringBasedStore programmatically or declaratively.

Declarative configuration

Using PooledConnectionFactory

<persistence>
   <string-keyed-jdbc-store xmlns="urn:infinispan:config:store:jdbc:10.1" shared="true">
      <connection-pool connection-url="jdbc:h2:mem:infinispan_string_based;DB_CLOSE_DELAY=-1"
                       username="sa"
                       driver="org.h2.Driver"/>
      <string-keyed-table drop-on-exit="true"
                          prefix="ISPN_STRING_TABLE">
         <id-column name="ID_COLUMN" type="VARCHAR(255)" />
         <data-column name="DATA_COLUMN" type="BINARY" />
         <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT" />
         <segment-column name="SEGMENT_COLUMN" type="INT" />
      </string-keyed-table>
   </string-keyed-jdbc-store>
</persistence>

Using ManagedConnectionFactory

<persistence>
  <string-keyed-jdbc-store xmlns="urn:infinispan:config:store:jdbc:10.1" shared="true">
    <data-source jndi-url="java:/StringStoreWithManagedConnectionTest/DS" />
    <string-keyed-table drop-on-exit="true"
                        create-on-start="true"
                        prefix="ISPN_STRING_TABLE">
        <id-column name="ID_COLUMN" type="VARCHAR(255)" />
        <data-column name="DATA_COLUMN" type="BINARY" />
        <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT" />
        <segment-column name="SEGMENT_COLUMN" type="INT"/>
    </string-keyed-table>
  </string-keyed-jdbc-store>
</persistence>

Programmatic configuration

Using PooledConnectionFactory

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.persistence().addStore(JdbcStringBasedStoreConfigurationBuilder.class)
      .fetchPersistentState(false)
      .ignoreModifications(false)
      .purgeOnStartup(false)
      .shared(true)
      .table()
         .dropOnExit(true)
         .createOnStart(true)
         .tableNamePrefix("ISPN_STRING_TABLE")
         .idColumnName("ID_COLUMN").idColumnType("VARCHAR(255)")
         .dataColumnName("DATA_COLUMN").dataColumnType("BINARY")
         .timestampColumnName("TIMESTAMP_COLUMN").timestampColumnType("BIGINT")
         .segmentColumnName("SEGMENT_COLUMN").segmentColumnType("INT")
      .connectionPool()
         .connectionUrl("jdbc:h2:mem:infinispan_string_based;DB_CLOSE_DELAY=-1")
         .username("sa")
         .driverClass("org.h2.Driver");

Using ManagedConnectionFactory

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.persistence().addStore(JdbcStringBasedStoreConfigurationBuilder.class)
      .fetchPersistentState(false)
      .ignoreModifications(false)
      .purgeOnStartup(false)
      .shared(true)
      .table()
         .dropOnExit(true)
         .createOnStart(true)
         .tableNamePrefix("ISPN_STRING_TABLE")
         .idColumnName("ID_COLUMN").idColumnType("VARCHAR(255)")
         .dataColumnName("DATA_COLUMN").dataColumnType("BINARY")
         .timestampColumnName("TIMESTAMP_COLUMN").timestampColumnType("BIGINT")
         .segmentColumnName("SEGMENT_COLUMN").segmentColumnType("INT")
      .dataSource()
         .jndiUrl("java:/StringStoreWithManagedConnectionTest/DS");

10.2.4. JPA Cache Stores

JPA (Java Persistence API) cache stores, JpaStore, use a formal schema to persist data. Other applications can then read from persistent storage to load data that Data Grid wrote. However, other applications should not use persistent storage concurrently with Data Grid.

When using JpaStore, you should take the following into consideration:

Keys should be the ID of the entity. Values should be the entity object.
Only a single @Id or @EmbeddedId annotation is allowed.
Auto-generated IDs with the @GeneratedValue annotation are not supported.
All entries are stored as immortal.
JpaStore does not support segmentation.

Declarative configuration

<local-cache name="vehicleCache">
   <persistence passivation="false">
      <jpa-store xmlns="urn:infinispan:config:store:jpa:10.1"
         persistence-unit="org.infinispan.persistence.jpa.configurationTest"
         entity-class="org.infinispan.persistence.jpa.entity.Vehicle"
      />
   </persistence>
</local-cache>

Parameter	Description
`persistence-unit`	Specifies the JPA persistence unit name in the JPA configuration file, `persistence.xml`, that contains the JPA entity class.
`entity-class`	Specifies the fully qualified JPA entity class name that is expected to be stored in this cache. Only one class is allowed.

Programmatic configuration

Configuration cacheConfig = new ConfigurationBuilder().persistence()
             .addStore(JpaStoreConfigurationBuilder.class)
             .persistenceUnitName("org.infinispan.loaders.jpa.configurationTest")
             .entityClass(User.class)
             .build();

Parameter	Description
`persistenceUnitName`	Specifies the JPA persistence unit name in the JPA configuration file, `persistence.xml`, that contains the JPA entity class.
`entityClass`	Specifies the fully qualified JPA entity class name that is expected to be stored in this cache. Only one class is allowed.

10.2.4.1. JPA Cache Store Usage Example

This section provides an example for using JPA cache stores.

Prerequisites

Configure Data Grid to marshall your JPA entities. By default, Data Grid uses ProtoStream for marshalling Java objects. To marshall JPA entities, you must create a SerializationContextInitializer implementation that registers a .proto schema and marshaller with a SerializationContext.

Procedure

Define a persistence unit "myPersistenceUnit" in persistence.xml.

<persistence-unit name="myPersistenceUnit">
	...
</persistence-unit>

Create a user entity class.

@Entity
public class User implements Serializable {
	@Id
	private String username;
	private String firstName;
	private String lastName;

	...
}

Configure a cache named "usersCache" with a JPA cache store so that data you put into the cache is persisted to the database according to the JPA configuration.

EmbeddedCacheManager cacheManager = ...;

Configuration cacheConfig = new ConfigurationBuilder().persistence()
            .addStore(JpaStoreConfigurationBuilder.class)
            .persistenceUnitName("myPersistenceUnit")
            .entityClass(User.class)
            .build();
cacheManager.defineCache("usersCache", cacheConfig);

Cache<String, User> usersCache = cacheManager.getCache("usersCache");
usersCache.put("raytsang", new User(...));

Caches that use a JPA cache store can store one type of data only, as in the following example:

Cache<String, User> usersCache = cacheManager.getCache("myJPACache");
// Cache is configured for the User entity class
usersCache.put("username", new User());
// Cannot configure caches to use another entity class with JPA cache stores
Cache<Integer, Teacher> teachersCache = cacheManager.getCache("myJPACache");
teachersCache.put(1, new Teacher());
// The put request does not work for the Teacher entity class

The @EmbeddedId annotation allows you to use composite keys, as in the following example:

@Entity
public class Vehicle implements Serializable {
	@EmbeddedId
	private VehicleId id;
	private String color;
	...
}

@Embeddable
public class VehicleId implements Serializable
{
	private String state;
	private String licensePlate;
	...
}

10.2.5. Remote Cache Stores

Remote cache stores, RemoteStore, use a remote Data Grid cluster as storage.

RemoteStore uses the Hot Rod protocol to communicate with remote Data Grid clusters.

In the following configuration examples, RemoteStore uses the remote cache named "mycache" on Data Grid servers "one" and "two":

Declarative configuration

<persistence>
   <remote-store xmlns="urn:infinispan:config:store:remote:10.1" cache="mycache" raw-values="true">
      <remote-server host="one" port="12111" />
      <remote-server host="two" />
      <connection-pool max-active="10" exhausted-action="CREATE_NEW" />
      <write-behind />
   </remote-store>
</persistence>

Programmatic configuration

ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence().addStore(RemoteStoreConfigurationBuilder.class)
      .fetchPersistentState(false)
      .ignoreModifications(false)
      .purgeOnStartup(false)
      .remoteCacheName("mycache")
      .rawValues(true)
      .addServer()
         .host("one").port(12111)
      .addServer()
         .host("two")
      .connectionPool()
         .maxActive(10)
         .exhaustedAction(ExhaustedAction.CREATE_NEW)
      .async().enable();

Segmentation

RemoteStore supports segmentation and can publish keys and entries by segment, which makes bulk operations more efficient. However, segmentation is available only with Data Grid Hot Rod protocol version 2.3 or later.

Warning

Ensure the number of segments and hash are the same between the Remote cache store and Data Grid servers, otherwise bulk operations do not return correct results.

10.2.6. RocksDB Cache Stores

RocksDB provides key-value filesystem-based storage with high performance and reliability for highly concurrent environments.

RocksDB cache stores, RocksDBStore, use two databases. One database provides a primary cache store for data in memory; the other database holds entries that Data Grid expires from memory.

Declarative configuration

<local-cache name="vehicleCache">
   <persistence>
      <rocksdb-store xmlns="urn:infinispan:config:store:rocksdb:10.1" path="rocksdb/data">
         <expiration path="rocksdb/expired"/>
      </rocksdb-store>
   </persistence>
</local-cache>

Programmatic configuration

Configuration cacheConfig = new ConfigurationBuilder().persistence()
      .addStore(RocksDBStoreConfigurationBuilder.class)
      .build();
EmbeddedCacheManager cacheManager = new DefaultCacheManager(cacheConfig);

Cache<String, User> usersCache = cacheManager.getCache("usersCache");
usersCache.put("raytsang", new User(...));

Properties props = new Properties();
props.put("database.max_background_compactions", "2");
props.put("data.write_buffer_size", "512MB");

Configuration cacheConfig = new ConfigurationBuilder().persistence()
      .addStore(RocksDBStoreConfigurationBuilder.class)
      .location("rocksdb/data")
      .expiredLocation("rocksdb/expired")
      .properties(props)
      .build();

Table 10.1. RocksDBStore configuration parameters

Parameter	Description
`location`	Specifies the path to the RocksDB database that provides the primary cache store. If you do not set the location, it is automatically created. Note that the path must be relative to the global persistent location.
`expiredLocation`	Specifies the path to the RocksDB database that provides the cache store for expired data. If you do not set the location, it is automatically created. Note that the path must be relative to the global persistent location.
`expiryQueueSize`	Sets the size of the in-memory queue for expiring entries. When the queue reaches this size, Data Grid flushes the expired entries into the RocksDB cache store.
`clearThreshold`	Sets the maximum number of entries before deleting and re-initializing (re-init) the RocksDB database. For smaller size cache stores, iterating through all entries and removing each one individually can provide a faster method.

RocksDB tuning parameters

You can also specify the following RocksDB tuning parameters:

compressionType
blockSize
cacheSize

RocksDB configuration properties

Optionally set properties in the configuration as follows:

Prefix properties with database to adjust and tune RocksDB databases.
Prefix properties with data to configure the column families in which RocksDB stores your data.

<property name="database.max_background_compactions">2</property>
<property name="data.write_buffer_size">64MB</property>
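The prefix convention can be illustrated with plain `java.util.Properties`. The following is a minimal sketch and not how Data Grid itself processes these properties; `RocksDbPropertyGroups` and `withPrefix` are hypothetical names used only to show how `database.` and `data.` entries fall into two separate groups.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only: groups store properties by prefix.
final class RocksDbPropertyGroups {

    // Returns the entries whose names start with the given prefix, with the prefix removed.
    // Pass the prefix including its trailing dot so "data." does not match "database.".
    static Properties withPrefix(Properties all, String prefix) {
        Properties group = new Properties();
        for (String name : all.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                group.setProperty(name.substring(prefix.length()), all.getProperty(name));
            }
        }
        return group;
    }
}
```

Calling `withPrefix(props, "database.")` on the two properties above would return only `max_background_compactions`, while `withPrefix(props, "data.")` would return only `write_buffer_size`.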

Segmentation

RocksDBStore supports segmentation and creates a separate column family per segment. Segmented RocksDB cache stores improve lookup performance and iteration but slightly lower performance of write operations.

Note

You should not configure more than a few hundred segments. RocksDB is not designed to have an unlimited number of column families. Too many segments also significantly increase cache store start time.

10.2.7. Soft-Index File Stores

Soft-Index File cache stores, SoftIndexFileStore, provide local file-based storage.

SoftIndexFileStore is a Java implementation that uses a variant of B+ Tree that is cached in-memory using Java soft references. The B+ Tree, called Index, is offloaded on the file system to a single file that is purged and rebuilt each time the cache store restarts.

SoftIndexFileStore stores data in a set of files rather than a single file. When usage of any file drops below 50%, the entries in the file are rewritten to another file and the original file is then deleted.

SoftIndexFileStore persists data in a set of files that are written in an append-only method. For this reason, if you use SoftIndexFileStore on a conventional magnetic disk, it does not need to seek when writing a burst of entries.

Most structures in SoftIndexFileStore are bounded, so out-of-memory exceptions do not pose a risk. You can also configure limits for concurrently open files.

By default the size of a node in the Index is limited to 4096 bytes. This size also limits the key length; more precisely the length of serialized keys. For this reason, you cannot use keys longer than the size of the node minus 15 bytes. Additionally, key length is stored as "short", which limits key length to 32767 bytes. SoftIndexFileStore throws an exception if keys are longer than these limits after serialization occurs.
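The two key-length limits can be expressed as a small pre-check. This is a sketch under assumptions: it reads the 15 bytes as overhead subtracted from the node size, and `KeyLengthCheck` is a hypothetical helper, not part of SoftIndexFileStore, which performs its own validation.

```java
// Hypothetical pre-check, for illustration only.
final class KeyLengthCheck {

    static final int DEFAULT_NODE_SIZE = 4096; // default Index node size in bytes, from the text
    static final int NODE_OVERHEAD = 15;       // assumed per-node overhead in bytes

    // Largest serialized key that fits both the node and the "short" length field.
    static int maxKeyLength(int nodeSize) {
        return Math.min(nodeSize - NODE_OVERHEAD, Short.MAX_VALUE);
    }

    // True if the serialized key is within both limits for the given node size.
    static boolean fits(byte[] serializedKey, int nodeSize) {
        return serializedKey.length <= maxKeyLength(nodeSize);
    }
}
```

Under these assumptions the node size is the binding limit at the default of 4096 bytes, and the "short" limit of 32767 bytes only applies to much larger node sizes.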

SoftIndexFileStore cannot detect expired entries, which can lead to excessive usage of space on the file system.

Note

AdvancedStore.purgeExpired() is not implemented in SoftIndexFileStore.

Declarative configuration

<persistence>
    <soft-index-file-store xmlns="urn:infinispan:config:store:soft-index:10.1">
        <index path="testCache/index" />
        <data path="testCache/data" />
    </soft-index-file-store>
</persistence>

Programmatic configuration

ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence()
    .addStore(SoftIndexFileStoreConfigurationBuilder.class)
        .indexLocation("testCache/index")
        .dataLocation("testCache/data");

Segmentation

Soft-Index File cache stores support segmentation and create a separate instance per segment, which results in multiple directories in the path you configure. Each directory is a number that represents the segment to which the data maps.

10.2.8. Custom Cache Stores

You can create custom cache stores that implement one or more of the Data Grid persistent SPIs.

10.2.8.1. Implementing Custom Cache Stores

Create custom cache stores that implement both CacheWriter and CacheLoader interfaces to fetch and persist data to external storage.

Implement the appropriate Data Grid persistent SPIs.
Annotate your store class with the @Store annotation and specify the appropriate properties.
For example, if your cache store is shared, use the @Store(shared = true) annotation.
Create a custom cache store configuration and builder.
1. Extend AbstractStoreConfiguration and AbstractStoreConfigurationBuilder.
  Extend AbstractSegmentedStoreConfiguration instead of AbstractStoreConfiguration for a segmented cache store that creates a different store instance per segment.
2. Optionally add the following annotations to ensure that your custom configuration builder parses your cache store configuration from XML:
  - @ConfigurationFor
  - @BuiltBy
  - @ConfiguredBy
    If you do not add these annotations, then CustomStoreConfigurationBuilder parses the common store attributes defined in AbstractStoreConfiguration and any additional elements are ignored.
    Note
    If a store and its configuration do not declare the @Store and @ConfigurationFor annotations, a warning message is logged when Data Grid initializes the cache.

10.2.8.2. Custom Cache Store Configuration

After you implement your custom cache store, configure Data Grid to use it.

Declarative configuration

<local-cache name="customStoreExample">
  <persistence>
    <store class="org.infinispan.persistence.dummy.DummyInMemoryStore" />
  </persistence>
</local-cache>

Programmatic configuration

Configuration config = new ConfigurationBuilder()
            .persistence()
            .addStore(CustomStoreConfigurationBuilder.class)
            .build();

10.2.8.3. Deploying Custom Cache Stores

You can package custom cache stores into JAR files and deploy them to Data Grid servers as follows:

Package your custom cache store implementation in a JAR file.
Add a file under META-INF/services/ that contains the fully qualified class name of your store implementation.
The name of the service file should reflect the interface that your store implements. For example, if your store implements the AdvancedCacheWriter interface then you should create the following file:
/META-INF/services/org.infinispan.persistence.spi.AdvancedCacheWriter
Add your JAR file to the server/lib directory of your Data Grid server.
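The service file step can be sketched with the standard `java.nio.file` API. This is an illustrative sketch only: `ServiceFileWriter` is a hypothetical helper, and in practice build tools usually create this file from the project's resources directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only: writes a service file into a JAR staging directory.
final class ServiceFileWriter {

    // Creates META-INF/services/<spiInterface> under jarRoot with the store class name as content.
    static Path write(Path jarRoot, String spiInterface, String storeClass) {
        try {
            Path servicesDir = jarRoot.resolve("META-INF").resolve("services");
            Files.createDirectories(servicesDir);
            Path serviceFile = servicesDir.resolve(spiInterface);
            byte[] content = (storeClass + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
            Files.write(serviceFile, content);
            return serviceFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `ServiceFileWriter.write(stagingDir, "org.infinispan.persistence.spi.AdvancedCacheWriter", "org.example.MyStore")` would produce the file named in the procedure, where `org.example.MyStore` stands in for your own store class.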

10.3. Data Grid Persistence SPIs

Data Grid Service Provider Interfaces (SPI) enable read and write operations to external storage and provide the following features:

Portability across JCache-compliant vendors: The Data Grid CacheWriter and CacheLoader interfaces align with the JSR-107 JCache specification.
Simplified transaction integration: Data Grid automatically handles locking so your implementations do not need to coordinate concurrent access to persistent stores. Depending on the locking mode you use, concurrent writes to the same key generally do not occur. However, you should expect operations on the persistent storage to originate from multiple threads and create implementations to tolerate this behavior.
Parallel iteration: Data Grid lets you iterate over entries in persistent stores with multiple threads in parallel.
Reduced serialization resulting in less CPU usage: Data Grid exposes stored entries in a serialized format that can be transmitted remotely. For this reason, Data Grid does not need to deserialize entries that it retrieves from persistent storage and then serialize again when writing to the wire.
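Because store operations can arrive from multiple threads, the backing structure of a custom store should be safe for concurrent access. The following is a minimal sketch of that idea using a `ConcurrentHashMap`; `ThreadSafeMapStore` is a hypothetical stand-in for external storage and does not implement the Data Grid SPIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory store, for illustration only.
final class ThreadSafeMapStore {

    // ConcurrentHashMap tolerates reads and writes from many threads without external locking.
    private final ConcurrentMap<String, byte[]> entries = new ConcurrentHashMap<>();

    // Stores a defensive copy so later changes to the caller's array do not affect the store.
    void write(String key, byte[] serializedValue) {
        entries.put(key, serializedValue.clone());
    }

    // Returns a copy of the stored bytes, or null if the key is absent.
    byte[] load(String key) {
        byte[] value = entries.get(key);
        return value == null ? null : value.clone();
    }

    // Returns true if an entry existed and was removed.
    boolean delete(String key) {
        return entries.remove(key) != null;
    }

    int size() {
        return entries.size();
    }
}
```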

10.3.1. Persistence SPI Classes

The following notes apply to Data Grid persistence SPI classes:

ByteBuffer: Abstracts the serialized form of an object.
MarshallableEntry: Abstracts the information held within a persistent store corresponding to a key/value added to the cache. Provides a method for reading this information both in serialized (ByteBuffer) and deserialized (Object) format. Normally data read from the store is kept in serialized format and lazily deserialized on demand, within the MarshallableEntry implementation.
CacheWriter and CacheLoader: Provide basic methods for writing to and reading from cache stores.
AdvancedCacheLoader and AdvancedCacheWriter: Provide bulk operations to manipulate the underlying storage, such as parallel iteration, purging of expired entries, clear, and size.
SegmentedAdvancedLoadWriteStore: Provides all the operations that deal with segments.
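The lazy deserialization described for MarshallableEntry can be sketched in plain Java. This is an illustration of the idea only: `LazyEntry` is a hypothetical class, not the Data Grid interface, and it uses standard Java serialization as a stand-in for the marshaller that Data Grid actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical class that keeps a value in serialized form and deserializes it on first access.
final class LazyEntry {
    private final byte[] valueBytes;
    private Object value;          // populated on the first call to getValue()
    private boolean deserialized;

    LazyEntry(byte[] valueBytes) {
        this.valueBytes = valueBytes.clone();
    }

    // Helper that serializes an object with Java serialization (assumption, see lead-in).
    static byte[] serialize(Object o) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serialized form, available without any deserialization cost.
    byte[] getValueBytes() {
        return valueBytes.clone();
    }

    // Deserializes once, on demand, and caches the result.
    synchronized Object getValue() {
        if (!deserialized) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(valueBytes))) {
                value = in.readObject();
                deserialized = true;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return value;
    }

    synchronized boolean isDeserialized() {
        return deserialized;
    }
}
```

A caller that only forwards the entry over the wire would use `getValueBytes()` and never pay for deserialization, which is the CPU saving the SPI description refers to.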

Cache stores can be segmented if they do one of the following:

Implement the SegmentedAdvancedLoadWriteStore interface. In this case only a single store instance is used per cache.
Have a configuration that extends the AbstractSegmentedStoreConfiguration abstract class.
This requires you to implement the newConfigurationFrom() method where it is expected that a new StoreConfiguration instance is created per invocation. This creates a store instance per segment to which a node can write. Stores might start and stop as data is moved between nodes.

A provider might choose to only implement a subset of these interfaces:

If a writer does not implement the AdvancedCacheWriter interface, then it cannot be used to purge expired entries or to clear the store.
If a loader does not implement the AdvancedCacheLoader interface, then it does not participate in preloading or in cache iteration (which stream operations also require).

10.4. Migrating Between Cache Stores

Data Grid provides a utility to migrate data from one cache store to another.

10.4.1. Cache Store Migrator

Data Grid provides the StoreMigrator.java utility that recreates data for the latest Data Grid cache store implementations.

StoreMigrator takes a cache store from a previous version of Data Grid as source and uses a cache store implementation as target.

When you run StoreMigrator, it creates the target cache with the cache store type that you define using the EmbeddedCacheManager interface. StoreMigrator then loads entries from the source store into memory and then puts them into the target cache.

StoreMigrator also lets you migrate data from one type of cache store to another. For example, you can migrate from a JDBC String-Based cache store to a Single File cache store.

Important

StoreMigrator cannot migrate data from segmented cache stores to:

Non-segmented cache store.
Segmented cache stores that have a different number of segments.

10.4.2. Getting the Store Migrator

StoreMigrator is available as part of the Data Grid tools library, infinispan-tools, and is included in the Maven repository.

Procedure

Configure your pom.xml for StoreMigrator as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.infinispan.example</groupId>
    <artifactId>jdbc-migrator-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
      <dependency>
        <groupId>org.infinispan</groupId>
        <artifactId>infinispan-tools</artifactId>
      </dependency>
      <!-- Additional dependencies -->
    </dependencies>

    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>exec-maven-plugin</artifactId>
          <version>1.2.1</version>
          <executions>
            <execution>
              <goals>
                <goal>java</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <mainClass>org.infinispan.tools.store.migrator.StoreMigrator</mainClass>
            <arguments>
              <argument>path/to/migrator.properties</argument>
            </arguments>
          </configuration>
        </plugin>
      </plugins>
    </build>
</project>

10.4.3. Configuring the Store Migrator

Set properties for source and target cache stores in a migrator.properties file.

Procedure

Create a migrator.properties file.
Configure the source cache store in migrator.properties.
1. Prepend all configuration properties with source. as in the following example:
```
source.type=SOFT_INDEX_FILE_STORE
source.cache_name=myCache
source.location=/path/to/source/sifs
```
Configure the target cache store in migrator.properties.
1. Prepend all configuration properties with target. as in the following example:
```
target.type=SINGLE_FILE_STORE
target.cache_name=myCache
target.location=/path/to/target/sfs.dat
```
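If you generate migrator.properties from code, the `source.` and `target.` prefixes can be applied with `java.util.Properties`. The following is a minimal sketch; `MigratorProperties` and `addStore` are hypothetical names, and the property values are the ones from the examples above.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, for illustration only: prefixes store settings for migrator.properties.
final class MigratorProperties {

    // Copies each setting into props under "<prefix>.<name>", for example "source.type".
    static void addStore(Properties props, String prefix, Map<String, String> settings) {
        for (Map.Entry<String, String> setting : settings.entrySet()) {
            props.setProperty(prefix + "." + setting.getKey(), setting.getValue());
        }
    }
}
```

After calling `addStore` once with `"source"` and once with `"target"`, you could write the result with the standard `Properties.store` method.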

10.4.3.1. Store Migrator Properties

Configure source and target cache stores in the StoreMigrator properties file.

Table 10.2. Cache Store Type Property

Property	Description	Required/Optional
`type`	Specifies the type of cache store for a source or target. `.type=JDBC_STRING` `.type=JDBC_BINARY` `.type=JDBC_MIXED` `.type=LEVELDB` `.type=ROCKSDB` `.type=SINGLE_FILE_STORE` `.type=SOFT_INDEX_FILE_STORE`	Required

Table 10.3. Common Properties

Property	Description	Example Value	Required/Optional
`cache_name`	Names the cache that the store backs.	`.cache_name=myCache`	Required
`segment_count`	Specifies the number of segments for target cache stores that can use segmentation. The number of segments must match `clustering.hash.numSegments` in the Data Grid configuration. In other words, the number of segments for a cache store must match the number of segments for the corresponding cache. If the number of segments is not the same, Data Grid cannot read data from the cache store.	`.segment_count=256`	Optional
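The required and optional properties in Tables 10.2 and 10.3 can be checked before you run the migration. The following is a sketch of such a check, not part of StoreMigrator; `MigratorPropertiesCheck` is a hypothetical helper, and the accepted type values are the ones listed in Table 10.2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical pre-flight check, for illustration only.
final class MigratorPropertiesCheck {

    // Store types listed in Table 10.2.
    static final Set<String> STORE_TYPES = new HashSet<>(Arrays.asList(
            "JDBC_STRING", "JDBC_BINARY", "JDBC_MIXED", "LEVELDB", "ROCKSDB",
            "SINGLE_FILE_STORE", "SOFT_INDEX_FILE_STORE"));

    // Returns a list of problems; an empty list means the common properties look valid.
    static List<String> problems(Properties props) {
        List<String> problems = new ArrayList<>();
        for (String side : new String[] {"source", "target"}) {
            String type = props.getProperty(side + ".type");
            if (type == null) {
                problems.add(side + ".type is required");
            } else if (!STORE_TYPES.contains(type)) {
                problems.add(side + ".type has unknown value " + type);
            }
            if (props.getProperty(side + ".cache_name") == null) {
                problems.add(side + ".cache_name is required");
            }
        }
        // segment_count is optional and applies to target stores only.
        String segments = props.getProperty("target.segment_count");
        if (segments != null && !segments.matches("[1-9][0-9]*")) {
            problems.add("target.segment_count must be a positive integer");
        }
        return problems;
    }
}
```

This check cannot confirm that `segment_count` matches `clustering.hash.numSegments`; you still need to compare it with the cache configuration yourself.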

Table 10.4. JDBC Properties

Property	Description	Required/Optional
`dialect`	Specifies the dialect of the underlying database.	Required
`version`	Specifies the marshaller version for source cache stores. Set one of the following values: * `8` for Data Grid 7.2.x * `9` for Data Grid 7.3.x * `10` for Data Grid 8.x	Required for source stores only. For example: `source.version=9`
`marshaller.class`	Specifies a custom marshaller class.	Required if using custom marshallers.
`marshaller.externalizers`	Specifies a comma-separated list of custom `AdvancedExternalizer` implementations to load in this format: `[id]:<Externalizer class>`	Optional
`connection_pool.connection_url`	Specifies the JDBC connection URL.	Required
`connection_pool.driver_class`	Specifies the class of the JDBC driver.	Required
`connection_pool.username`	Specifies a database username.	Required
`connection_pool.password`	Specifies a password for the database username.	Required
`db.major_version`	Sets the database major version.	Optional
`db.minor_version`	Sets the database minor version.	Optional
`db.disable_upsert`	Disables database upsert.	Optional
`db.disable_indexing`	Specifies if table indexes are created.	Optional
`table.string.table_name_prefix`	Specifies additional prefixes for the table name.	Optional
`table.string.<id|data|timestamp>.name`	Specifies the column name.	Required
`table.string.<id|data|timestamp>.type`	Specifies the column type.	Required
`key_to_string_mapper`	Specifies the `TwoWayKey2StringMapper` class.	Optional

Note

To migrate from Binary cache stores in older Data Grid versions, change table.string.* to table.binary.* in the following properties:

source.table.binary.table_name_prefix
source.table.binary.<id|data|timestamp>.name
source.table.binary.<id|data|timestamp>.type

# Example configuration for migrating to a JDBC String-Based cache store
target.type=JDBC_STRING
target.cache_name=myCache
target.dialect=POSTGRES
target.marshaller.class=org.example.CustomMarshaller
target.marshaller.externalizers=25:Externalizer1,org.example.Externalizer2
target.connection_pool.connection_url=jdbc:postgresql:postgres
target.connection_pool.driver_class=org.postgresql.Driver
target.connection_pool.username=postgres
target.connection_pool.password=redhat
target.db.major_version=9
target.db.minor_version=5
target.db.disable_upsert=false
target.db.disable_indexing=false
target.table.string.table_name_prefix=tablePrefix
target.table.string.id.name=id_column
target.table.string.data.name=datum_column
target.table.string.timestamp.name=timestamp_column
target.table.string.id.type=VARCHAR
target.table.string.data.type=bytea
target.table.string.timestamp.type=BIGINT
target.key_to_string_mapper=org.infinispan.persistence.keymappers.DefaultTwoWayKey2StringMapper

Table 10.5. RocksDB Properties

Property	Description	Required/Optional
`location`	Sets the database directory.	Required
`compression`	Specifies the compression type to use.	Optional

# Example configuration for migrating from a RocksDB cache store.
source.type=ROCKSDB
source.cache_name=myCache
source.location=/path/to/rocksdb/database
source.compression=SNAPPY

Table 10.6. SingleFileStore Properties

Property	Description	Required/Optional
`location`	Sets the directory that contains the cache store `.dat` file.	Required

# Example configuration for migrating to a Single File cache store.
target.type=SINGLE_FILE_STORE
target.cache_name=myCache
target.location=/path/to/sfs.dat

Table 10.7. SoftIndexFileStore Properties

Property	Description	Required/Optional
`location`	Sets the database directory.	Required
`index_location`	Sets the database index directory.	Required

# Example configuration for migrating to a Soft-Index File cache store.
target.type=SOFT_INDEX_FILE_STORE
target.cache_name=myCache
target.location=path/to/sifs/database
target.index_location=path/to/sifs/index

10.4.4. Migrating Cache Stores

Run StoreMigrator to migrate data from one cache store to another.

Prerequisites

Get infinispan-tools.jar.
Create a migrator.properties file that configures the source and target cache stores.

Procedure

If you build infinispan-tools.jar from source, do the following:
1. Add infinispan-tools.jar and dependencies for your source and target databases, such as JDBC drivers, to your classpath.
2. Specify migrator.properties file as an argument for StoreMigrator.
If you pull infinispan-tools.jar from the Maven repository, run the following command:
mvn exec:java
