Chapter 9. Monitoring Data Grid Servers

9.1. Working with Data Grid Server Logs

Data Grid uses Apache Log4j 2 to provide configurable logging mechanisms that capture details about the environment and record cache operations for troubleshooting purposes and root cause analysis.

9.1.1. Data Grid Log Files

Data Grid writes log messages to the following directory:
$RHDG_HOME/${infinispan.server.root}/log

server.log
Messages in human-readable format, including boot logs that relate to the server startup.
Data Grid creates this file by default when you launch servers.
server.log.json
Messages in JSON format that let you parse and analyze Data Grid logs.
Data Grid creates this file when you enable the JSON-FILE appender.

9.1.2. Configuring Data Grid Log Properties

You configure Data Grid logs with log4j2.xml, which is described in the Log4j 2 manual.

Procedure

  1. Open $RHDG_HOME/${infinispan.server.root}/conf/log4j2.xml with any text editor.
  2. Change the logging configuration as appropriate.
  3. Save and close log4j2.xml.

9.1.2.1. Log Levels

Log levels indicate the nature and severity of messages.

Log level  Description
TRACE      Fine-grained debug messages, capturing the flow of individual requests through the application.
DEBUG      Messages for general debugging, not related to an individual request.
INFO       Messages about the overall progress of applications, including lifecycle events.
WARN       Events that can lead to errors or degrade performance.
ERROR      Error conditions that might prevent operations or activities from being successful but do not prevent applications from running.
FATAL      Events that could cause critical service failure and application shutdown.

In addition to the levels of individual messages presented above, the configuration allows two more values: ALL to include all messages, and OFF to exclude all messages.

9.1.2.2. Data Grid Log Categories

Data Grid provides categories for INFO, WARN, ERROR, and FATAL level messages that organize logs by functional area.

org.infinispan.CLUSTER
Messages specific to Data Grid clustering that include state transfer operations, rebalancing events, partitioning, and so on.
org.infinispan.CONFIG
Messages specific to Data Grid configuration.
org.infinispan.CONTAINER
Messages specific to the data container that include expiration and eviction operations, cache listener notifications, transactions, and so on.
org.infinispan.PERSISTENCE
Messages specific to cache loaders and stores.
org.infinispan.SECURITY
Messages specific to Data Grid security.
org.infinispan.SERVER
Messages specific to Data Grid servers.
org.infinispan.XSITE
Messages specific to cross-site replication operations.

9.1.2.3. Log Appenders

Log appenders define how Data Grid records log messages.

CONSOLE
Write log messages to the host standard out (stdout) or standard error (stderr) stream.
Uses the org.apache.logging.log4j.core.appender.ConsoleAppender class by default.
FILE
Write log messages to a file.
Uses the org.apache.logging.log4j.core.appender.RollingFileAppender class by default.
JSON-FILE
Write log messages to a file in JSON format.
Uses the org.apache.logging.log4j.core.appender.RollingFileAppender class by default.

9.1.2.4. Log Patterns

The CONSOLE and FILE appenders use a PatternLayout to format the log messages according to a pattern.

An example is the default pattern in the FILE appender:
%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p (%t) [%c{1}] %m%throwable%n

  • %d{yyyy-MM-dd HH:mm:ss,SSS} adds the current time and date.
  • %-5p adds the log level, left-aligned and padded on the right to a width of five characters.
  • %t adds the name of the current thread.
  • %c{1} adds the short name of the logging category.
  • %m adds the log message.
  • %throwable adds the exception stack trace.
  • %n adds a new line.

Patterns are fully described in the PatternLayout documentation.
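
To make the conversion specifiers concrete, the following Java sketch reproduces the layout of the default FILE pattern by hand, using only the JDK. It is not how Log4j 2 formats messages internally, and it omits %throwable. The PatternSketch class and its format method are hypothetical names used only for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Data Grid or Log4j 2.
final class PatternSketch {
    // Same date layout as %d{yyyy-MM-dd HH:mm:ss,SSS}.
    private static final DateTimeFormatter TIMESTAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Mimics: %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p (%t) [%c{1}] %m%n
    static String format(LocalDateTime when, String level, String thread,
                         String category, String message) {
        // %c{1} keeps only the rightmost component of the category name.
        String shortCategory = category.substring(category.lastIndexOf('.') + 1);
        // %-5s left-aligns the level and pads it on the right to five characters.
        return String.format("%s %-5s (%s) [%s] %s%n",
            TIMESTAMP.format(when), level, thread, shortCategory, message);
    }

    public static void main(String[] args) {
        System.out.print(format(LocalDateTime.now(), "INFO",
            Thread.currentThread().getName(), "org.infinispan.SERVER", "Server started"));
    }
}
```

Running the sketch prints one line in which a four-letter level such as INFO is followed by two spaces, because the level field is always five characters wide.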

9.1.2.5. Enabling and Configuring the JSON Log Handler

Data Grid provides a JSON log handler to write messages in JSON format.

Prerequisites

Ensure that Data Grid is not running. You cannot enable log appenders dynamically.

Procedure

  1. Open $RHDG_HOME/${infinispan.server.root}/conf/log4j2.xml with any text editor.
  2. Uncomment the JSON-FILE appender and comment out the FILE appender:

          <!--<AppenderRef ref="FILE"/>-->
          <AppenderRef ref="JSON-FILE"/>
  3. Optionally configure the JSON appender and layout.
  4. Save and close log4j2.xml.

When you start Data Grid, it writes each log message as a JSON map in the following file:
$RHDG_HOME/${infinispan.server.root}/log/server.log.json

9.1.3. Access Logs

Hot Rod and REST endpoints can record all inbound client requests as log entries with the following categories:

  • org.infinispan.HOTROD_ACCESS_LOG logging category for the Hot Rod endpoint.
  • org.infinispan.REST_ACCESS_LOG logging category for the REST endpoint.

9.1.3.1. Enabling Access Logs

Access logs for Hot Rod and REST endpoints are disabled by default. To enable either logging category, set the level to TRACE in the Data Grid logging configuration, as in the following example:

<Logger name="org.infinispan.HOTROD_ACCESS_LOG" additivity="false" level="TRACE">
   <AppenderRef ref="HR-ACCESS-FILE"/>
</Logger>

9.1.3.2. Access Log Properties

The default format for access logs is as follows:

%X{address} %X{user} [%d{dd/MMM/yyyy:HH:mm:ss Z}] "%X{method} %m %X{protocol}" %X{status} %X{requestSize} %X{responseSize} %X{duration}%n

The preceding format creates log entries such as the following:

127.0.0.1 - [DD/MM/YYYY:HH:MM:SS +0000] "PUT /rest/v2/caches/default/key HTTP/1.1" 404 5 77 10

Logging properties use the %X{name} notation and let you modify the format of access logs. The following are the default logging properties:

Property      Description
address       Either the X-Forwarded-For header or the client IP address.
user          Principal name, if using authentication.
method        Method used. PUT, GET, and so on.
protocol      Protocol used. HTTP/1.1, HTTP/2, HOTROD/2.9, and so on.
status        An HTTP status code for the REST endpoint. OK or an exception for the Hot Rod endpoint.
requestSize   Size, in bytes, of the request.
responseSize  Size, in bytes, of the response.
duration      Number of milliseconds that the server took to handle the request.

Tip

Use the header name prefixed with h: to log headers that were included in requests; for example, %X{h:User-Agent}.
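
If you post-process access logs, the default format can be split into its properties with a regular expression. The following Java sketch assumes that entries use the default format unchanged and that the status value contains no spaces. The AccessLogLine class is a hypothetical helper, not a Data Grid API, and the "timestamp" and "message" keys are names chosen here for the %d and %m parts of the pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that splits one access log entry in the default format.
final class AccessLogLine {
    // address user [timestamp] "method message protocol" status requestSize responseSize duration
    private static final Pattern DEFAULT_FORMAT = Pattern.compile(
        "^(\\S+) (\\S+) \\[([^\\]]+)\\] \"(\\S+) (.*) (\\S+)\" (\\S+) (\\d+) (\\d+) (\\d+)$");

    private static final String[] FIELDS = {
        "address", "user", "timestamp", "method", "message",
        "protocol", "status", "requestSize", "responseSize", "duration"
    };

    // Returns the fields in log order; rejects lines that do not match the default format.
    static Map<String, String> parse(String line) {
        Matcher matcher = DEFAULT_FORMAT.matcher(line.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not in the default access log format: " + line);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            fields.put(FIELDS[i], matcher.group(i + 1));
        }
        return fields;
    }

    public static void main(String[] args) {
        String entry = "127.0.0.1 - [DD/MM/YYYY:HH:MM:SS +0000] "
            + "\"PUT /rest/v2/caches/default/key HTTP/1.1\" 404 5 77 10";
        parse(entry).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

If you change the access log pattern, for example to add a %X{h:User-Agent} header, the expression and the field list need the same change.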

9.2. Configuring Statistics, Metrics, and JMX

Enable statistics that Data Grid exports to a MicroProfile Metrics endpoint or via JMX MBeans. You can also register JMX MBeans to perform management operations.

9.2.1. Enabling Data Grid Statistics

Data Grid lets you enable statistics for Cache Managers and caches. However, enabling statistics for a Cache Manager does not enable statistics for the caches that it controls. You must explicitly enable statistics for your caches.

Note

Data Grid server enables statistics for Cache Managers by default.

Procedure

  • Enable statistics declaratively or programmatically.

Declaratively

<cache-container statistics="true"> 1
  <local-cache name="mycache" statistics="true"/> 2
</cache-container>

1
Enables statistics for the Cache Manager.
2
Enables statistics for the named cache.

Programmatically

GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
  .cacheContainer().statistics(true) 1
  .build();

 ...

Configuration config = new ConfigurationBuilder()
  .statistics().enable() 2
  .build();

1
Enables statistics for the Cache Manager.
2
Enables statistics for the named cache.

9.2.2. Enabling Data Grid Metrics

Configure Data Grid to export gauges and histograms.

Procedure

  • Configure metrics declaratively or programmatically.

Declaratively

<cache-container statistics="true"> 1
  <metrics gauges="true" histograms="true" /> 2
</cache-container>

1
Computes and collects statistics about the Cache Manager.
2
Exports collected statistics as gauge and histogram metrics.

Programmatically

GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
  .statistics().enable() 1
  .metrics().gauges(true).histograms(true) 2
  .build();

1
Computes and collects statistics about the Cache Manager.
2
Exports collected statistics as gauge and histogram metrics.

9.2.3. Collecting Data Grid Metrics

Collect Data Grid metrics with monitoring tools such as Prometheus.

Prerequisites

  • Enable statistics. If you do not enable statistics, Data Grid provides 0 and -1 values for metrics.
  • Optionally enable histograms. By default Data Grid generates gauges but not histograms.

Procedure

  • Get metrics in Prometheus (OpenMetrics) format:

    $ curl -v http://localhost:11222/metrics
  • Get metrics in MicroProfile JSON format:

    $ curl --header "Accept: application/json" http://localhost:11222/metrics
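
The same two requests can be issued from Java. The following sketch uses the java.net.http client that ships with Java 11 and later. It assumes the server listens on localhost:11222 and that the metrics endpoint is reachable without credentials, which might not hold for your deployment. MetricsClient is a hypothetical class name.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper that mirrors the two curl commands above.
final class MetricsClient {
    // Builds a GET request for /metrics; json=true asks for the MicroProfile JSON format.
    static HttpRequest metricsRequest(String baseUrl, boolean json) {
        HttpRequest.Builder builder =
            HttpRequest.newBuilder(URI.create(baseUrl + "/metrics")).GET();
        if (json) {
            builder.header("Accept", "application/json");
        }
        return builder.build();
    }

    // Sends the request and returns the response body as text.
    static String fetch(String baseUrl, boolean json) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(metricsRequest(baseUrl, json), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumes a Data Grid server is running locally on the default port.
        System.out.println(fetch("http://localhost:11222", false));
    }
}
```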

Next steps

Configure monitoring applications to collect Data Grid metrics. For example, add the following to prometheus.yml:

static_configs:
    - targets: ['localhost:11222']


9.2.4. Configuring Data Grid to Register JMX MBeans

Data Grid can register JMX MBeans that you can use to collect statistics and perform administrative operations. However, enabling JMX does not enable statistics. You must enable statistics separately; otherwise, Data Grid provides 0 values for all statistic attributes.

Procedure

  • Enable JMX declaratively or programmatically.

Declaratively

<cache-container>
  <jmx enabled="true" /> 1
</cache-container>

1
Registers Data Grid JMX MBeans.

Programmatically

GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
  .jmx().enable() 1
  .build();

1
Registers Data Grid JMX MBeans.

9.2.4.1. Data Grid MBeans

Data Grid exposes JMX MBeans that represent manageable resources.

org.infinispan:type=Cache
Attributes and operations available for cache instances.
org.infinispan:type=CacheManager
Attributes and operations available for cache managers, including Data Grid cache and cluster health statistics.

For a complete list of available JMX MBeans along with descriptions and available operations and attributes, see the Data Grid JMX Components documentation.

9.3. Retrieving Server Health Statistics

Monitor the health of your Data Grid clusters in the following ways:

  • Programmatically, by calling the embeddedCacheManager.getHealth() method.
  • Through JMX MBeans.
  • Through the Data Grid REST server.

9.3.1. Accessing the Health API via JMX

Retrieve Data Grid cluster health statistics via JMX.

Procedure

  1. Connect to the Data Grid server using any JMX-capable tool, such as JConsole, and navigate to the following object:

    org.infinispan:type=CacheManager,name="default",component=CacheContainerHealth
  2. Select available MBeans to retrieve cluster health statistics.
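
The same MBean can be read from code with the standard javax.management API. The following Java sketch builds the object name shown in step 1 and reads whatever readable attributes the MBean exposes. It does not assume any particular attribute names. How you obtain the MBeanServerConnection, for example through a remote JMX connector, depends on your deployment and is not shown. JmxHealthSketch is a hypothetical class name.

```java
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for inspecting the CacheContainerHealth MBean.
final class JmxHealthSketch {
    // Builds org.infinispan:type=CacheManager,name="<name>",component=CacheContainerHealth.
    static ObjectName healthObjectName(String cacheManagerName) {
        try {
            return new ObjectName("org.infinispan:type=CacheManager,name="
                + ObjectName.quote(cacheManagerName) + ",component=CacheContainerHealth");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads every readable attribute of the MBean, sorted by attribute name.
    static Map<String, Object> readAttributes(MBeanServerConnection connection, ObjectName name) {
        Map<String, Object> values = new TreeMap<>();
        try {
            for (MBeanAttributeInfo info : connection.getMBeanInfo(name).getAttributes()) {
                if (!info.isReadable()) {
                    continue;
                }
                try {
                    values.put(info.getName(), connection.getAttribute(name, info.getName()));
                } catch (Exception e) {
                    // An attribute that cannot be read on this runtime is skipped.
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Cannot inspect " + name, e);
        }
        return values;
    }
}
```

With a connection in hand, readAttributes(connection, healthObjectName("default")) returns a map that you can print or feed into your own checks.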

9.3.2. Accessing the Health API via REST

Get Data Grid cluster health via the REST API.

Procedure

  • Invoke a GET request to retrieve cluster health.

    GET /rest/v2/cache-managers/{cacheManagerName}/health

Data Grid responds with a JSON document such as the following:

{
    "cluster_health":{
        "cluster_name":"ISPN",
        "health_status":"HEALTHY",
        "number_of_nodes":2,
        "node_names":[
            "NodeA-36229",
            "NodeB-28703"
        ]
    },
    "cache_health":[
        {
            "status":"HEALTHY",
            "cache_name":"___protobuf_metadata"
        },
        {
            "status":"HEALTHY",
            "cache_name":"cache2"
        },
        {
            "status":"HEALTHY",
            "cache_name":"mycache"
        },
        {
            "status":"HEALTHY",
            "cache_name":"cache1"
        }
    ]
}

Tip

Get cache manager status as follows:

GET /rest/v2/cache-managers/{cacheManagerName}/health/status
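
A monitoring script often needs only the overall status and the names of caches that are not healthy. The following Java sketch extracts both from a health response shaped like the sample above. To stay within the JDK it uses regular expressions instead of a JSON parser, and it assumes that "status" precedes "cache_name" in each entry, as in the sample. A production client would more likely use a JSON library. HealthResponseSketch is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that reads fields from the health JSON document.
final class HealthResponseSketch {
    private static final Pattern CLUSTER_STATUS =
        Pattern.compile("\"health_status\"\\s*:\\s*\"([^\"]+)\"");

    // One cache_health entry, with "status" before "cache_name" as in the sample response.
    private static final Pattern CACHE_ENTRY = Pattern.compile(
        "\"status\"\\s*:\\s*\"([^\"]+)\"\\s*,\\s*\"cache_name\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the value of cluster_health.health_status.
    static String clusterStatus(String json) {
        Matcher matcher = CLUSTER_STATUS.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No health_status field in response");
        }
        return matcher.group(1);
    }

    // Returns the names of caches whose status is anything other than HEALTHY.
    static List<String> cachesNotHealthy(String json) {
        List<String> names = new ArrayList<>();
        Matcher matcher = CACHE_ENTRY.matcher(json);
        while (matcher.find()) {
            if (!"HEALTHY".equals(matcher.group(1))) {
                names.add(matcher.group(2));
            }
        }
        return names;
    }
}
```

For the sample response above, clusterStatus returns HEALTHY and cachesNotHealthy returns an empty list.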

Reference

See the REST v2 (version 2) API documentation for more information.