2. Introduction: Monitoring and Responding to Resource Activity

One of the core functions of JBoss Operations Network is that it lets administrators stay aware of the state of their JBoss servers, platforms, and overall IT environment.
The current state of individual servers and applications provides critical information to IT staff about traffic and usage, equipment failures, and server performance. JBoss Operations Network can supply a clearer picture of these critical data by automatically monitoring resources in its inventory.
The most powerful aspect of management is the ability to know, accurately, where your resources are and to respond to that ever-changing situation reliably.

2.1. Monitoring and Types of Data

Monitoring gives insight into how a specific machine, application, or service is performing. JBoss ON collects different types of information from different native and external sources for its managed resources.
JBoss Operations Network is not a real-time monitor, an archive of data points, or a profiler. Instead, JBoss ON filters and processes raw data so that long-term trends, operating parameters, and performance histories (the purpose of monitoring) are clear and accessible. JBoss ON uses schedules to define what information to gather and how frequently (anywhere from 30 seconds to hours). This prioritizes the performance information for a resource and makes important information more visible and coherent.
Although the precise information gathered is different depending on the resource type, there are a few broad categories of monitoring data. Each category obtains information from a different place and is useful to determine a different aspect of resource behavior.
Availability or "up and down" monitoring
This is both basic and critical. Availability is status information about the resource, whether it is running or stopped.
Numeric metrics
Metrics are the core performance data for a resource. Almost every software product exposes some sort of information about itself, some measurable facet that can be checked. This is usually numeric information, which JBoss ON collects on defined schedules.
Metric information is processed by the server. There are three states of the monitoring data used:
  • Raw data, which are the readings collected on schedule by the agent and sent to the server
  • Aggregated data, which are compressed data processed by the server into 1-hour, 6-hour, and 24-hour averages and used to calculate baselines and normal operating ranges for resources. These aggregated data are the information displayed in the monitoring graphs and returned in the CLI as metrics.
  • Live values, which are ad hoc requests for the current value of a metric.
    Metric values are not rolling live streams of the resource state; they are essentially snapshots that the agent takes of the readings on predefined schedules. Those data are then aggregated into averages that are used to track resource performance.
    Live values are immediate, unaggregated, current readings of a metric value.
Metric information is especially important because it is collected and stored long-term. This allows for historical views on resource performance, as well as recent views.
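The relationship between raw readings and aggregated data can be illustrated with a small sketch. This is not JBoss ON's server code; the `aggregate` function, the `(timestamp, value)` tuple layout, and the sample readings are all assumptions made for illustration. It only shows the general idea of rolling scheduled snapshots up into fixed-width averages such as 1-hour or 6-hour buckets.

```python
from collections import defaultdict
from statistics import mean


def aggregate(readings, bucket_seconds):
    """Average (timestamp_seconds, value) readings per fixed-width time bucket.

    Hypothetical helper: illustrates the roll-up idea only, not the
    algorithm JBoss ON actually uses.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Align each reading to the start of the bucket that contains it.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical raw readings: one snapshot every 20 minutes for two hours.
raw = [(0, 10.0), (1200, 20.0), (2400, 30.0),
       (3600, 40.0), (4800, 50.0), (6000, 60.0)]

hourly = aggregate(raw, 3600)          # 1-hour averages
six_hourly = aggregate(raw, 6 * 3600)  # 6-hour averages

print(hourly)
print(six_hourly)
```

With these sample readings, the six raw points collapse into two hourly averages and a single 6-hour average, which is why aggregated data are suited to trends rather than to inspecting an individual reading.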
Logfile messages (events)
While JBoss ON is not a log viewer, it can monitor specified logs and check for important log messages based on severity or strings within the log messages. This is event monitoring, and it allows JBoss ON to identify incidents for a resource and to send an alert notification and, if necessary, take corrective action based on dynamic information outside normal metrics.
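As a rough illustration of filtering by severity or by strings within a message, the sketch below scans log lines and keeps only those that qualify as events. The log line layout, the severity names, and the `match_events` function are assumptions for this example; they are not the format or API that JBoss ON uses.

```python
import re

# Assumed severity ranking and line layout: "<date> <time> <SEVERITY> <message>".
SEVERITY_ORDER = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "FATAL": 4}
LINE_PATTERN = re.compile(r"^\S+ \S+ (?P<severity>[A-Z]+) (?P<message>.*)$")


def match_events(lines, min_severity="ERROR", contains=None):
    """Return (severity, message) pairs at or above min_severity.

    If `contains` is given, the message must also include that substring.
    """
    threshold = SEVERITY_ORDER[min_severity]
    events = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # Skip lines that do not follow the assumed layout.
        severity = match.group("severity")
        message = match.group("message")
        if SEVERITY_ORDER.get(severity, -1) < threshold:
            continue
        if contains is not None and contains not in message:
            continue
        events.append((severity, message))
    return events


# Hypothetical log excerpt.
sample = [
    "2024-01-01 10:00:00 INFO Server started",
    "2024-01-01 10:05:00 ERROR OutOfMemoryError in worker thread",
    "2024-01-01 10:06:00 WARN Slow response",
    "2024-01-01 10:07:00 ERROR Connection refused",
]

errors = match_events(sample)
oom_only = match_events(sample, contains="OutOfMemoryError")
print(errors)
print(oom_only)
```

The matched events, rather than the full log, are what would then be evaluated against an alert condition.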
Response time metrics
Certain types of resources (URLs for web servers or session beans) depend on responsiveness as a component of overall performance. Response time or call-time data tracks how quickly the URL or session bean responds to client requests and helps determine whether the overall application performs well.
Descriptive strings (traits)
Most resources have some relatively static information that describes the resource itself, such as an instance name, build date, or version number. This information is a trait. As with other attributes for a resource, this can be monitored. Traits are useful to identify changes to the underlying application, like a version update.

2.2. Alerts and Responses to Changing Conditions

A critical part of monitoring is being aware of when undesirable events occur. Alerting works with other functions in JBoss ON management (monitoring data and configuration drift detection) to define conditions for triggering an alert.
When an alert condition is met, alerting in JBoss ON serves two important functions:
  • Alerts communicate that there has been a problem, based on parameters defined by an administrator.
  • Alerts respond to incidents automatically. Administrators can automatically initiate an operation, run a JBoss ON CLI script to change JBoss ON or resource configuration, redeploy content, or run a shell script, all in response to an alert condition.
    Automatic, administrator-defined responses to alerts make it significantly easier for administrators to address infrastructure problems quickly, and can mitigate the effect of outages.
Alerts are based on metrics information, call-time data, availability, and events, all normal monitoring elements. Alerting can also be based on critical changes to a resource, defined in drift definitions that track configuration drift. Tracking configuration for resources along with monitoring data lets administrators remedy unplanned or undesirable system changes easily and consistently.
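The two functions of an alert, communicating a problem and responding to it, can be pictured as a condition paired with a list of responses. The sketch below is purely conceptual: `AlertDefinition`, `notify`, `restart_service`, and the 90.0 threshold are hypothetical names and values, not part of the JBoss ON API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AlertDefinition:
    """Hypothetical pairing of an alert condition with its responses."""
    name: str
    condition: Callable[[float], bool]
    responses: List[Callable[[str, float], str]] = field(default_factory=list)

    def evaluate(self, value: float) -> List[str]:
        # No condition met: nothing is communicated and nothing is run.
        if not self.condition(value):
            return []
        # Condition met: run every administrator-defined response in order.
        return [respond(self.name, value) for respond in self.responses]


def notify(name: str, value: float) -> str:
    # Stand-in for sending a notification.
    return f"notify: {name} fired at {value}"


def restart_service(name: str, value: float) -> str:
    # Stand-in for initiating an operation on the resource.
    return f"operation: restart requested by {name}"


high_heap = AlertDefinition("high-heap", lambda v: v > 90.0,
                            [notify, restart_service])

print(high_heap.evaluate(95.0))
print(high_heap.evaluate(50.0))
```

Keeping the condition separate from the responses mirrors the text above: the same condition can drive a notification, an operation, or both.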

2.3. Potential Impact on Server Performance

Theoretically, there is no limit to the number of metrics that can be collected or the number of alerts that can be fired.
In reality, there are natural constraints within the IT environment that limit both monitoring and alert settings:
  • Database performance, which is the primary factor in most environments
  • Network bandwidth
There are no hard limits on JBoss ON's alerting and monitoring configuration; the practical ceiling depends on the number of resources, the number of metrics, the collection frequency, and the number of alerts.
As a rule of thumb, there are these performance thresholds:
  • Up to 30,000 metrics can be collected per minute
  • Up to 100,000 alerts can be fired per day (roughly 70 per minute)
Plan how to implement metrics collection and alerting. Prioritize resources and then the information required from those resources when enabling metrics schedules and setting collection frequencies. Then, based on those priorities, plan what alerts are required.
Clear monitoring and alerting strategies can help maintain performance while still gathering critical information.
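When planning, it can help to estimate the collection load before enabling schedules and compare it with the rule-of-thumb thresholds above. The following is a back-of-the-envelope sketch; the function names and the sample plan are hypothetical, and the only figures taken from this guide are the 30,000 metrics per minute and 100,000 alerts per day thresholds.

```python
METRICS_PER_MINUTE_GUIDELINE = 30_000  # rule of thumb from this guide
ALERTS_PER_DAY_GUIDELINE = 100_000     # rule of thumb from this guide


def metrics_per_minute(schedules):
    """Estimate collected metrics per minute.

    schedules: iterable of (resource_count, metrics_per_resource,
    collection_interval_seconds) tuples.
    """
    return sum(count * metrics * 60.0 / interval
               for count, metrics, interval in schedules)


def alerts_per_minute(alerts_per_day):
    """Convert a daily alert count into an average per-minute rate."""
    return alerts_per_day / (24 * 60)


# Hypothetical plan: high-priority resources polled every minute,
# lower-priority resources polled every 10 minutes.
plan = [
    (200, 10, 60),
    (1000, 5, 600),
]

load = metrics_per_minute(plan)
within_guideline = load <= METRICS_PER_MINUTE_GUIDELINE

print(load, within_guideline)
print(alerts_per_minute(ALERTS_PER_DAY_GUIDELINE))
```

Lengthening the collection interval for lower-priority resources reduces the estimated load proportionally, which is why prioritizing resources first makes the remaining budget easier to allocate.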

2.4. Differences with Monitoring Based on Different Resource Types

Available metrics, events, traits, and other monitoring settings are defined for each resource type in its plug-in descriptor.
Software of completely different types naturally has different monitoring configuration options.
Monitoring settings can also differ between releases of the same software: either different metrics are available, or the same metric has a different configuration name. For example, JBoss EAP 4 and 5 have the same metrics, related to monitoring the EAP server JVM, threads, and transactions. Because of the different management structure in JBoss EAP 6, it has different metrics, related to management requests between the servers in the EAP 6 domain.
The Resource Reference: Monitoring, Operation, and Configuration Options has a complete reference of available metrics for the official JBoss ON agent plug-ins. Check this guide to see what differences there are between release versions.