Chapter 16. Introduction: Monitoring and Responding to Resource Activity

One of the core functions of JBoss Operations Network is that it lets administrators stay aware of the state of their JBoss servers, platforms, and overall IT environment.
The current state of individual servers and applications provides critical information to IT staff about traffic and usage, equipment failures, and server performance. JBoss Operations Network can supply a clearer picture of these critical data by automatically monitoring resources in its inventory.
The most powerful aspect of management is the ability to know, accurately, where your resources are and to respond to that ever-changing situation reliably.

16.1. Monitoring and Types of Data

Monitoring gives insight into how a specific machine, application, or service is performing. JBoss ON collects different types of information from different native and external sources for its managed resources.
JBoss Operations Network is not a real-time monitor or an archive of data points or a profiler. What JBoss ON does is, in essence, filter and process raw data so that long-term trends, operating parameters, and performance histories — the purpose of monitoring — are clear and accessible from the data. JBoss ON uses schedules to define what information to gather and how frequently (anywhere from 30 seconds to hours). This prioritizes the performance information for a resource and makes important information more visible and coherent.
Although the precise information gathered is different depending on the resource type, there are a few broad categories of monitoring data. Each category obtains information from a different place and is useful to determine a different aspect of resource behavior.
Availability or "up and down" monitoring
This is both basic and critical. Availability is status information about the resource, whether it is running or stopped.
Numeric metrics
Metrics are the core performance data for a resource. Almost every software product exposes some sort of information about itself, some measurable facet that can be checked. This is usually This numeric information is collected by JBoss ON, on defined schedules.
Metric information is processed by the server. There are three states of the monitoring data used:
  • Raw data, which are the readings collected on schedule by the agent and sent to the server
  • Aggregated data, which is compressed data processed by the server into 1-hour, 6-hour, and 24-hour averages and used to calculate baselines and normal operating ranges for resources. These aggregated data are the information displayed in the monitoring graphs and returned in the CLI as metrics.
  • Live values, which are ad hoc requests for the current value of a metric.
    Metric values are rolling live-streams of the resource state; they are essentially snapshots that the agent takes of the readings on predefined schedules. Those data are then aggregated into means and averages to use to track resource performance.
    Live values are immediate, aggregated, current readings of a metric value.
Metric information is especially important because it is collected and stored long-term. This allows for historical views on resource performance, as well as recent views.
Logfile messages (events)
While JBoss ON is not a log viewer, it can monitor specified logs and check for important log messages based on severity or strings within the log messages. This is event monitoring, and it allows JBoss ON to identify incidents for a resource and to send an alert notification and, if necessary, take corrective action based on dynamic information outside normal metrics.
Response time metrics
Certain types of resources (URLs for web servers or session beans) depend on responsiveness as a component of overall performance. Response time or call-time data tracks how quickly the URL or session bean responds to client requests and helps determine that the overall application is performant.
Descriptive strings (traits)
Most resources have some relatively static information that describe the resource itself, such as an instance name, build date, or version number. This information is a trait. As with other attributes for a resource, this can be monitored. Traits are useful to identify changes to the underlying application, like a version update.