Chapter 6. Monitoring and tuning Data Grid queries

Data Grid exposes statistics for queries and provides attributes that you can adjust to improve query performance.

6.1. Getting query statistics

Collect statistics to gather information about performance of your indexes and queries, including information such as the types of indexes and average time for queries to complete.

Procedure

Do one of the following:

  • Invoke the getSearchStatistics() or getClusteredSearchStatistics() methods for embedded caches.
  • Use GET requests to obtain statistics for remote caches from the REST API.

Embedded caches

// Statistics for the local cluster member
SearchStatistics statistics = Search.getSearchStatistics(cache);

// Consolidated statistics for the whole cluster
CompletionStage<SearchStatisticsSnapshot> statistics = Search.getClusteredSearchStatistics(cache)

Remote caches

GET /v2/caches/{cacheName}/search/stats

6.2. Tuning query performance

Use the following guidelines to help you improve the performance of indexing operations and queries.

Checking index usage statistics

Queries against partially indexed caches return slower results. For instance, if some fields in a schema are not annotated then the resulting index does not include those fields.

Start tuning query performance by checking the time it takes for each type of query to run. If your queries seem to be slow, you should make sure that queries are using the indexes for caches and that all entities and field mappings are indexed.

Adjusting the commit interval for indexes

Indexing can degrade write throughput for Data Grid clusters. The commit-interval attribute defines the interval, in milliseconds, between which index changes that are buffered in memory are flushed to the index storage and a commit is performed.

This operation is costly so you should avoid configuring an interval that is too small. The default is 1000 ms (1 second).

Adjusting the refresh interval for queries

The refresh-interval attribute defines the interval, in milliseconds, between which the index reader is refreshed.

The default value is 0, which returns data in queries as soon as it is written to a cache.

A value greater than 0 results in some stale query results but substantially increases throughput, especially in write-heavy scenarios. If you do not need data returned in queries as soon as it is written, you should adjust the refresh interval to improve query performance.