Chapter 12. Monitoring Open Liberty

You can use MicroProfile Metrics and MicroProfile Health to monitor microservices and applications that run on Open Liberty. Enabling and reporting metric and health check data for your microservices helps you pinpoint issues, collect data for capacity planning, and decide when to scale a service up or down.

12.1. Monitoring with metrics

With Open Liberty, two types of metrics are available to monitor your applications: REST endpoint-style metrics, which are provided by MicroProfile Metrics, and Java Management Extensions (JMX) metrics.

MicroProfile Metrics can be accessed by monitoring tools, such as Prometheus, and by any client that’s capable of making REST requests. JMX metrics are suitable for use by Java-based monitoring tools that can communicate with JMX servers, or by custom JMX clients.

12.1.1. MicroProfile Metrics and the /metrics endpoint

The MicroProfile Metrics feature provides a /metrics REST interface that conforms to the MicroProfile Metrics specification. Open Liberty uses MicroProfile Metrics to expose metrics that describe the internal state of many Open Liberty components. Developers can also use the MicroProfile Metrics API to expose metrics from their applications.

Metrics come in various forms that include counters, gauges, timers, histograms, and meters. You can access MicroProfile Metrics by enabling the MicroProfile Metrics feature. Real-time values of all metrics are available by calling the /metrics endpoint. For more information about adding these metrics to your applications, see Microservice observability with metrics. For a list of all REST endpoint-style metrics that are available for Open Liberty, see the metrics reference list.
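
To expose the /metrics endpoint, enable the feature in your server configuration. The following server.xml fragment is a minimal sketch; the feature version shown (mpMetrics-2.3) is an assumption and should match the MicroProfile Metrics version that your server uses:

```xml
<server description="metrics sample">
    <featureManager>
        <!-- Feature version is an assumption; use the version that matches your setup. -->
        <feature>mpMetrics-2.3</feature>
    </featureManager>
</server>
```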

The /metrics endpoint provides two output formats. The format of each response depends on the HTTP Accept header of the corresponding request. Prometheus format is a representation of the metrics that is compatible with the Prometheus monitoring tool, an open source metrics scraper, data store, and basic visualization tool. This format is returned for requests with a text/plain Accept header. JSON format is a JSON representation of the metrics. This format is returned for requests with an application/json Accept header.

The following table displays the different endpoints that can be accessed to provide metrics in Prometheus or JSON format with a GET request:

Table 12.1. Metrics endpoints that are accessible by GET request

  • /metrics (GET; Prometheus, JSON): Returns all registered metrics.
  • /metrics/<scope> (GET; Prometheus, JSON): Returns metrics that are registered for the specified scope.
  • /metrics/<scope>/<metric name> (GET; Prometheus, JSON): Returns metrics that match the metric name for the specified scope.

12.1.2. JMX metrics

You can access JMX metrics by enabling the Performance Monitoring feature. After you add this feature to your server configuration, JMX metrics are automatically monitored. The Performance Monitoring feature provides MXBeans that you can use with monitoring tools that use JMX, such as JConsole.
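
Enabling the Performance Monitoring feature is likewise a one-line change to server.xml. The following fragment is a sketch; monitor-1.0 is the feature name as an assumption for current Open Liberty releases:

```xml
<featureManager>
    <feature>monitor-1.0</feature>
</featureManager>
```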

JConsole is a graphical monitoring tool that you can use to monitor JVM and Java applications. After you enable monitoring for Open Liberty, you can use JConsole to connect to the JVM and view performance data by clicking attributes of the MXBeans. You can also use other products that consume JMX metrics to view your metrics information. For a list of all JMX metrics that are available for Open Liberty, see the JMX metrics reference list.
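
As an illustration of the JMX access pattern that tools such as JConsole use, the following standalone sketch reads an attribute from one of the JVM's standard platform MXBeans. The class name JmxPeek is hypothetical; Liberty's own MXBeans are read the same way once the Performance Monitoring feature registers them:

```java
import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example: read an MXBean attribute through the JMX API,
// the same pattern a JMX monitoring client applies to Liberty's MXBeans.
public class JmxPeek {
    static int availableProcessors() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("java.lang:type=OperatingSystem");
        return (Integer) server.getAttribute(name, "AvailableProcessors");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("AvailableProcessors = " + availableProcessors());
    }
}
```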

12.1.3. Both types of metrics combined

The MicroProfile Metrics feature provides basic metrics about the JVM and about metrics that are added to applications by using the MicroProfile Metrics API. However, the MicroProfile Metrics feature doesn’t provide metrics about Open Liberty server components unless it’s used along with the Performance Monitoring feature.

If you enable both the MicroProfile Metrics and Performance Monitoring features, then the MXBeans for monitoring and the /metrics REST endpoint are activated. In this case, metrics for Open Liberty server components are exposed through both interfaces. The MicroProfile Metrics feature version 2.3 and later automatically enables the Performance Monitoring feature.

12.2. Health checks for microservices

A health check is a special REST API implementation that you can use to validate the status of a microservice and its dependencies. MicroProfile Health enables microservices in an application to self-check their health and then publishes the overall health status to a defined endpoint.

A health check can assess anything that a microservice needs, including dependencies, system properties, database connections, endpoint connections, and resource availability. Services report their availability by implementing the API that MicroProfile Health provides. When a microservice is available, it reports an UP status. If a microservice is unavailable, it reports a DOWN status. A service orchestrator can then use these status reports to decide how to manage and scale the microservices within an application. Health checks can also interact with Kubernetes liveness and readiness probes.

For example, in a microservices-based banking application, you might implement health checks on the login microservice, the balance transfer microservice, and the bill pay microservice. If a health check on the balance transfer microservice detects a bug or deadlock, it returns a DOWN status. In this case, if the /health/live endpoint is configured in a Kubernetes liveness probe, the probe restarts the microservice automatically. Restarting the microservice saves the user from encountering downtime or an error and preserves the functions of the other microservices in the application.
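
A deployment fragment that wires these endpoints into Kubernetes probes might look like the following sketch; the container port (9080, Open Liberty's default HTTP port) and the timing values are assumptions to adjust for your environment:

```yaml
livenessProbe:
  httpGet:
    path: /health/live
    port: 9080        # Liberty's default HTTP port; adjust to your configuration
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 9080
  initialDelaySeconds: 30
  periodSeconds: 10
```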

12.2.1. MicroProfile Health endpoints and annotations

MicroProfile Health provides the following three endpoints:

  • /health/ready: returns the readiness of a microservice, or whether it is ready to process requests. This endpoint corresponds to the Kubernetes readiness probe.
  • /health/live: returns the liveness of a microservice, or whether it encountered a bug or deadlock. If this check fails, the microservice is not running and can be stopped. This endpoint corresponds to the Kubernetes liveness probe, which automatically restarts the pod if the check fails.
  • /health: aggregates the responses from the /health/live and /health/ready endpoints. This endpoint corresponds to the deprecated @Health annotation and is only available to provide compatibility with MicroProfile 1.0. If you are using MicroProfile 2.0 or higher, use the /health/ready or /health/live endpoints instead.

To implement readiness or liveness health checks, add either the @Liveness or @Readiness annotation to your code. These annotations link the provided endpoints to Kubernetes liveness and readiness probes.

The following example demonstrates the @Liveness annotation, which checks the heap memory usage. If memory consumption is less than 90%, it returns an UP status. If memory usage exceeds 90%, it returns a DOWN status:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Liveness;

@Liveness
@ApplicationScoped
public class MemoryCheck implements HealthCheck {
    @Override
    public HealthCheckResponse call() {
        // Status is up if used heap memory is < 90% of the maximum.
        MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
        long memUsed = memoryBean.getHeapMemoryUsage().getUsed();
        long memMax = memoryBean.getHeapMemoryUsage().getMax();

        return HealthCheckResponse.named("heap-memory")
                .withData("used", memUsed)
                .withData("max", memMax)
                .status(memUsed < memMax * 0.9)
                .build();
    }
}

The following example shows the JSON response from the /health/live endpoint if heap memory usage is less than 90%. The first status indicates the overall status of all health checks that are returned from the endpoint. The second status indicates the status of the particular check that is specified by the preceding name value, which in this example is heap-memory. For the overall status to be UP, all checks that run on the endpoint must pass:

{
  "status": "UP",
  "checks": [
    {
      "name": "heap-memory",
      "status": "UP",
      "data": {
        "used": 1475462,
        "max": 51681681
      }
    }
  ]
}

The next example shows the @Readiness annotation, which checks for an available database connection. If the connection is successful, it returns an UP status. If the connection is unavailable, it returns a DOWN status:

import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Readiness;

@Readiness
@ApplicationScoped
public class DatabaseReadyCheck implements HealthCheck {

    @Override
    public HealthCheckResponse call() {
        // isDBConnected() stands in for an application-specific connection test.
        if (isDBConnected()) {
            return HealthCheckResponse.up("databaseReady");
        } else {
            return HealthCheckResponse.down("databaseReady");
        }
    }
}

The following example shows the JSON response from the /health/ready endpoint if the database connection is unavailable. The first status indicates the overall status of all health checks returned from the endpoint. The second status indicates the status of the particular check that is specified by the preceding name value, which in this example is databaseReady. If any check that runs on the endpoint fails, the overall status is DOWN.

{
  "status": "DOWN",
  "checks": [
    {
      "name": "databaseReady",
      "status": "DOWN"
    }
  ]
}

12.2.2. See also