6. Configuring and Managing Alerts: Procedures

Monitoring (Section 2, “Monitoring Resources: An Introduction”) is the first step in a larger workflow whose sole intent is to keep administrators aware of what is happening in their network. The next two steps involve:
  • Setting parameters for JBoss ON to trigger a warning (alerts)
  • Notifying administrators when an alert is tripped (notifications)

6.1. Setting Alerts for a Resource

NOTE

It is not possible to edit an alert condition or an alert notification after it is set. To change the conditions or notifications for an alert definition, delete the condition or notification and create a new one.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the resource name in the list.
  4. Click the Alerts tab for the resource.
  5. In the Definitions subtab, click the New button to create the new alert.
  6. In the General Properties tab, give the basic information about the alert.
    • Name. Gives the name of the specific alert definition. This must be unique for the resource.
    • Description. Contains an optional description of the alert; this can be very useful if you want to trigger different kinds of alert responses at different conditions for the same resource.
    • Priority. Sets the priority or severity that is given to an alert triggered by this definition.
    • Enabled. Sets whether the alert definition is active. Alert definitions can be disabled to prevent unnecessary or spurious alerts if there is, for instance, a network outage or routine maintenance window for the resource.
  7. In the Conditions tab, set the metric or issue that triggers the alert. Click the Add button to bring up the conditions form.

    TIP

    There can be more than one condition set to trigger an alert. For example, you may only want to receive a notification for a server if its CPU goes above 80% and its available memory drops below 25MB. The ALL setting for the conditions restricts the alert notification to only when both criteria are met. Alternatively, you may want to know when either one occurs so that you can immediately change the load balancing configuration for the network. In that case, the ANY setting fires off a notification as soon as even one condition threshold is met.
    1. Click the Add a new condition button.
    2. From the initial drop-down menu, select the type of condition. The categories of conditions are described in Table 3, “Types of Alert Conditions”, and the exact conditions available to be set for every resource are listed in the Resource Monitoring Reference.
    3. Set the values for the condition.
  8. In the Notifications tab, click Add to set a notification for the alert.
    1. Select the method to use to send the alert notification in the Sender option.
      The Sender option first sets the specific type of alert method (such as email or SNMP) and then opens the appropriate form to fill in the details for that specific method.
    2. Fill in the required information for the alert sender method. The method may require contact information, SNMP settings, operations, or scripts, depending on what is selected.
  9. In the Recovery tab, set whether to send a recovery alert and whether the alert is disabled until the resource state is recovered.
  10. In the Dampening tab, give the dampening (or frequency) rule on how often to send notifications for the same alert event.
    The frequency for sending alerts depends on the expected behavior of the resource. There has to be a balance between sending too many alerts and sending too few. There are several frequency settings:
    • Consecutive. Sends an alert if the condition occurs a certain number of times in a row for metric calculations. For example, if this is set to three, then the condition must be detected in three consecutive metric collection periods for the alert to be fired. If this is set to one, then it sends an alert every time the condition occurs.
    • Last N evaluations. Sets the number of times that the condition has to occur within a given number of monitoring evaluation cycles before an alert is sent.
    • Time period. The other two dampening rules set a recurrence based on the JBoss ON monitoring cycles. This rule sets the recurrence based on a specific time period instead.
  11. Click OK to save the alert definition.
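The Consecutive and Last N evaluations rules described in step 10 can be pictured as small counters. The following JavaScript is an illustrative model only, not JBoss ON code: the function names are hypothetical, and resetting the counter after an alert fires is an assumption made to keep the sketch simple.

```javascript
// Consecutive: fire once the condition has held `count` evaluations in a row.
function consecutiveDampener(count) {
    var streak = 0;
    return function (conditionMet) {
        streak = conditionMet ? streak + 1 : 0;
        if (streak >= count) {
            streak = 0; // assumption: start counting again after firing
            return true;
        }
        return false;
    };
}

// Last N evaluations: fire once the condition has held at least
// `occurrences` times within the last `windowSize` evaluations.
function lastNDampener(occurrences, windowSize) {
    var history = [];
    return function (conditionMet) {
        history.push(conditionMet);
        if (history.length > windowSize) {
            history.shift();
        }
        var hits = history.filter(function (hit) { return hit; }).length;
        if (hits >= occurrences) {
            history = []; // assumption: start counting again after firing
            return true;
        }
        return false;
    };
}

// One entry per metric collection period: was the condition met?
var evaluations = [true, true, false, true, true, true];
var consecutiveFired = evaluations.map(consecutiveDampener(3));
var lastNFired = evaluations.map(lastNDampener(3, 4));
// consecutiveFired → [false, false, false, false, false, true]
// lastNFired       → [false, false, false, true, false, false]
```

With the same readings, the Consecutive rule waits for an unbroken run of three, while the Last N rule tolerates the single miss in the third period and fires earlier.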

6.2. Extended Example: Ranges, AND, and OR Operators with Conditions

Alerting is based on monitoring information. It is an extension that allows an administrator to receive a notification or define an action to take if a certain event or metric value occurs.
The monitoring point that triggers an alert is the alert condition. At its most simplistic, an alert condition is a single event or reading. If X occurs, then that triggers an alert.
In real life, X may not be enough to warrant an alert or to adequately describe the state of a resource. Different conditions may require the same response or a situation may only be critical if multiple conditions are true. Alerting is very flexible because it allows multiple conditions to be defined with established relationships between those conditions.
The next level of complexity is to send an alert if either X or Y is true. In the alert definition, this is the ANY option, which is a logical OR. The alert definition checks for any of those conditions, but those conditions are still unrelated to each other.
The last level of complexity is when the conditions have to relate to each other for an alert to be issued. This is the ALL option, which is a logical AND. Both X and Y must occur for the alert to be issued. In this case, when one condition occurs, the server puts a lock on that definition and begins waiting for the second condition to occur. When the second condition occurs, then the alert is issued.
An AND operator is very effective on different metrics, but because the conditions do not have to occur simultaneously, using a simple AND operator does not make sense for the same metric. For example, Tim the IT Guy only wants an alert to be issued when the user load is between 40% and 60%, indicating slightly increased loads on his platform. Attempting to use an AND operator produces unexpected alerts when the load spikes over 70% (which trips the above 40% condition) and then falls back to 15% (which triggers the below 60% condition).
In this case, Tim uses a range condition. A range requires two values from the same metric that are within the given boundaries. A range can be an inside range (40-60%) or an outside range (below 40% and above 60%).

Figure 5. Alert Condition Range
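The difference between the ALL setting and a range condition on a single metric can be sketched in a few lines of JavaScript. This is a simplified model of the behavior described above, not JBoss ON code: the function names are hypothetical, and the latch that remembers a met condition stands in for the lock the server puts on the definition.

```javascript
// ALL (logical AND) as described above: each condition is remembered once it
// has been met, and the alert fires when every condition has been met.
function makeLatchedAll(predicates) {
    var seen = predicates.map(function () { return false; });
    return function (value) {
        predicates.forEach(function (predicate, i) {
            if (predicate(value)) {
                seen[i] = true;
            }
        });
        var fired = seen.every(function (met) { return met; });
        if (fired) {
            seen = seen.map(function () { return false; }); // release the latch
        }
        return fired;
    };
}

// Range: a single reading must fall inside the boundaries.
function inRange(low, high) {
    return function (value) {
        return value >= low && value <= high;
    };
}

var above40 = function (load) { return load > 40; };
var below60 = function (load) { return load < 60; };

var readings = [70, 15, 50]; // user load, in percent
var allResults = readings.map(makeLatchedAll([above40, below60]));
var rangeResults = readings.map(inRange(40, 60));
// allResults   → [false, true, true]   (fires at 15%, which Tim does not want)
// rangeResults → [false, false, true]  (fires only at 50%)
```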


6.3. Assigning an Operation to an Alert

To set an alert operation, select the Operations alert method when configuring notifications. Operations can perform tasks or run scripts on a target resource; this is detailed in Section 5.3, “Alert Operations” and correlates to using resource operations, as described in Section 7, “Operations: An Introduction”.

6.3.1. Using Tokens with Alert Operations

Alert operations can use tokens to supply information about the event automatically. For example, tokens can be used to supply resource information in a command-line script.
These tokens have the following form:
<%space.param_name%>
The space gives the JBoss ON configuration area where the value is derived; this will commonly be either alert or resource. The param_name gives the entry value that is being supplied. For example, to point to the URL of the specific fired alert, the token would be <%alert.url%>, while to pull in the resource name, the token would be <%resource.name%>.
JBoss ON has pre-defined token values that relate to the fired alert, the resource which issued the alert, the resource which is the target of the operation, and the operation that was initiated. These are listed in Table 4, “Available Alert Operation Tokens”. All of these potential token values are Java properties that belong to the operation's parent JBoss ON server.
The alert operations plug-in resolves each token to its value when the alert operation is processed. The resolved value is sent to the script service, which plugs the value into the command-line argument or script that referenced the token.
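As a rough illustration of token resolution, the sketch below substitutes <%space.param_name%> tokens in a command line from a lookup table. It is not the plug-in's implementation: resolveTokens, the sample values, and the script name notify.sh are hypothetical, and leaving unknown tokens untouched is an assumption.

```javascript
// Replace every <%space.param_name%> token that has a known value.
function resolveTokens(text, values) {
    var tokenPattern = /<%\s*([A-Za-z]+)\.([A-Za-z_]+)\s*%>/g;
    return text.replace(tokenPattern, function (match, space, paramName) {
        var key = space + "." + paramName;
        if (Object.prototype.hasOwnProperty.call(values, key)) {
            return String(values[key]);
        }
        return match; // assumption: unknown tokens are left as they are
    });
}

// Hypothetical values for one fired alert.
var sampleValues = {
    "alert.url": "http://server.example.com:7080/alerts/10001",
    "resource.name": "web-01"
};

var command = resolveTokens(
    "notify.sh <%resource.name%> <%alert.url%> <%alert.unknown%>",
    sampleValues
);
// command → "notify.sh web-01 http://server.example.com:7080/alerts/10001 <%alert.unknown%>"
```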

Table 4. Available Alert Operation Tokens

Information about ...   Token                         Description
Fired Alert             alert.willBeDisabled          Whether the alert definition will be disabled after firing
Fired Alert             alert.id                      ID of this particular alert
Fired Alert             alert.url                     URL of the alert details page
Fired Alert             alert.name                    Name from the defining alert definition
Fired Alert             alert.priority                Priority of this alert
Fired Alert             alert.description             Description of this alert
Fired Alert             alert.firedAt                 Time the alert fired
Fired Alert             alert.conditions              A text representation of the conditions that led to this alert
Alerting Resource       resource.id                   ID of the resource
Alerting Resource       resource.platformType         Type of the platform the resource is on
Alerting Resource       resource.platformName         Name of the platform the resource is on
Alerting Resource       resource.typeName             Resource type name
Alerting Resource       resource.name                 Name of the resource
Alerting Resource       resource.platformId           ID of the platform the resource is on
Alerting Resource       resource.parentName           Name of the parent resource
Alerting Resource       resource.parentId             ID of the parent resource
Alerting Resource       resource.typeId               Resource type ID
Target Resource         targetResource.parentId       ID of the target's parent resource
Target Resource         targetResource.platformName   Name of the platform the target resource is on
Target Resource         targetResource.platformId     ID of the platform the target resource is on
Target Resource         targetResource.parentName     Name of the target's parent resource
Target Resource         targetResource.typeId         Resource type ID of the target resource
Target Resource         targetResource.platformType   Type of the platform the target resource is on
Target Resource         targetResource.name           Name of the target resource
Target Resource         targetResource.id             ID of the target resource
Target Resource         targetResource.typeName       Resource type name of the target resource
Operation               operation.id                  ID of the operation fired
Operation               operation.name                Name of the operation fired

6.3.2. Setting Alert Operations

  1. Configure the basic alert definition, as in Section 6.1, “Setting Alerts for a Resource”.
  2. In the Notifications tab for the alert definition, give the notification method a name, and select the Resource Operations method from the Alert Senders drop-down menu.
  3. First, set the resource that the operation will run on. The default is the resource that the alert is set for; it is also possible to set it on another specific resource or on the results of a search.

    IMPORTANT

    If you select a relative resource and do not enter a specific resource name, then the operation will run on every resource which matches that resource type in the relative path. If no resource matches, then it is logged into the audit trail and the alert process proceeds.
    For a relative resource, the resource name is not required. For a specific resource, it is.
  4. Select the operation type. The available operations and their configuration parameters depend on the type of resource selected as the target of the operation.
    The Resource Monitoring Reference lists the available operations per resource type. Section 7, “Operations: An Introduction” has more information on setting operations in general.
  5. Configure the parameters of the operation. The available settings depend on the type of operation selected.
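The IMPORTANT note in step 3 describes how a relative resource is matched. A minimal sketch of that selection logic follows, assuming an exact match on the name filter. The function selectTargets and the sample inventory are hypothetical and only illustrate the rule, not the server's actual lookup.

```javascript
// Pick the resources an operation would run on: every candidate of the
// given type, narrowed to one name when a name filter is supplied.
function selectTargets(candidates, typeName, nameFilter) {
    return candidates.filter(function (resource) {
        if (resource.typeName !== typeName) {
            return false;
        }
        // assumption: the name filter is an exact match
        return !nameFilter || resource.name === nameFilter;
    });
}

// Hypothetical resources found along the relative path.
var candidates = [
    { name: "restart.sh", typeName: "Script" },
    { name: "backup.sh", typeName: "Script" },
    { name: "web-01", typeName: "Tomcat Server" }
];

var withoutName = selectTargets(candidates, "Script");              // both scripts
var withName = selectTargets(candidates, "Script", "backup.sh");    // one script
var noMatch = selectTargets(candidates, "Script", "missing.sh");    // empty list
```

In the empty case, the text above says the miss is logged in the audit trail and the alert process proceeds.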

6.4. Initiating Resource Scripts from an Alert

To run a resource script in response to an alert, select the Resource Operations alert method when configuring notifications, and supply any required environment variables or arguments. This is the same as using an operation to execute a script, as described in Section 8.5, “Running Scripts as Operations for JBoss Servers”.

NOTE

The script must be uploaded to the resource and added into the JBoss ON inventory before it can be used in an alert operation.
  1. Import the script into the resource inventory where it should run in response to the alert. If necessary, run manual discovery to detect and add the script.
  2. Configure the basic alert definition, as in Section 6.1, “Setting Alerts for a Resource”.
  3. In the Notifications tab for the alert definition, give the notification method a name, and select the Resource Operations method from the Alert Senders drop-down menu.
  4. Select the script resource that will be run in response to the alert.

    IMPORTANT

    If you select a relative resource and do not enter a specific script name in the name filter field, then the operation will run on every script resource that is in the relative path with the command arguments that are given. If no script matches, then it is logged into the audit trail and the alert process proceeds.
    For a relative resource, the resource name is not required. For a specific resource, it is. To limit script execution to a single specific script, select the specific resource option and select the precise script from the list.
  5. Set what operation to perform with the script and, optionally, any command-line arguments to pass to the script.

6.5. Launching JBoss ON CLI Scripts from an Alert

JBoss ON has its own command-line client that can be used to manage server instances in the same way that the web UI manages servers. Much like running a script resource or launching an operation in response to an alert condition, a server CLI script can be run in response to an alert condition.

NOTE

For server CLI scripts, the script must be uploaded to the server as content within a repository before it can be run.
The CLI script must use the proper API to perform the operation on the server. JBoss ON has several different API sets, depending on the task being performed. Connecting to a server and running a script requires the remoting API, which allows commands to be executed on the server remotely. Writing CLI scripts is covered in more detail in Running JBoss ON Command-Line Scripts.
  1. Create a script which is relevant to the alert. Commands, options, and variables for the JBoss ON CLI are listed in Running JBoss ON Command-Line Scripts.
    An example alert script is included in the server files, in serverInstallDir/alert-scripts/.

    TIP

    The CLI script can reference an alert object for the alert which triggers the script by using a pre-defined alert variable.
    For example, this script checks the recent monitoring statistics for a web application and restarts the web server database if there are connection problems:
    // Look up the resource that fired the alert through the pre-defined alert variable.
    var myResource = ProxyFactory.getResource(alert.alertDefinition.resource.id)

    // Find the 'Sessions created per Minute' metric definition for this resource type.
    var definitionCriteria = new MeasurementDefinitionCriteria()
    definitionCriteria.addFilterDisplayName('Sessions created per Minute')
    definitionCriteria.addFilterResourceTypeId(myResource.resourceType.id)

    var definitions = MeasurementDefinitionManager.findMeasurementDefinitionsByCriteria(definitionCriteria)

    if (definitions.empty) {
       throw new java.lang.Exception("Could not get 'Sessions created per Minute' metric on resource "
          + myResource.id)
    }

    var definition = definitions.get(0)

    // Collect the last 8 hours of data; both bounds are in epoch milliseconds.
    var endDate = new Date().getTime()
    var startDate = endDate - 8 * 3600 * 1000 //8 hrs in milliseconds

    var data = MeasurementDataManager.findDataForResource(myResource.id, [ definition.id ], startDate, endDate, 60)

    // Export the collected data to a CSV file named after the end time.
    exporter.setTarget('csv', '/the/output/folder/for/my/metrics/' + endDate + '.csv')

    exporter.write(data.get(0))

    // 10411 is the hard-coded resource ID of the data source to test.
    var dataSource = ProxyFactory.getResource(10411)

    var connectionTest = dataSource.testConnection()

    if (connectionTest == null || connectionTest.get('result').booleanValue == false) {
        //ok, this means we had problems connecting to the database
        //let's suppose there's an executable bash script somewhere on the server that
        //the admins use to restart the database
        java.lang.Runtime.getRuntime().exec('/somewhere/on/the/server/restart-database.sh')
    }
  2. Upload the script to a content repository.

    TIP

    Create a separate repository for alert CLI scripts.
  3. Search for the resource, and configure the basic alert definition, as in Section 6.1, “Setting Alerts for a Resource”.
  4. In the Notifications tab for the alert definition, give the notification method a name, and select the CLI Script method from the Alert Senders drop-down menu.
  5. First, select the JBoss ON user to run the script as. The default is the user who is creating the notification.
  6. Select the repository which contains the CLI script. If you are uploading a new script, this is the repository to which the script will be added.
  7. Select the CLI script to use from the drop-down menu, which lists all of the scripts in the specified repository. Alternatively, click the Upload button to browse to a script on the local machine.
  8. Click OK to save the notification. The line in the Notifications tab shows the script, the repository, and the user the script will run as.
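When writing a script for step 1, it can help to keep the logic that reads the alert variable in a small function that can be tried outside the CLI with a stand-in object. The sketch below assumes the alert object exposes alertDefinition.resource.id (as used in the example script in step 1) and alertDefinition.name; the name property, the function describeAlert, and the stub values are assumptions for illustration.

```javascript
// Build a one-line summary from an alert-like object.
function describeAlert(alert) {
    var definition = alert.alertDefinition;
    return "Alert '" + definition.name + "' fired on resource " + definition.resource.id;
}

// Stand-in for the pre-defined alert variable; the values are made up.
var stubAlert = {
    alertDefinition: {
        name: "High CPU",
        resource: { id: 10001 }
    }
};

var summary = describeAlert(stubAlert);
// summary → "Alert 'High CPU' fired on resource 10001"
```

Inside an alert CLI script, the same function would be called with the pre-defined alert variable instead of the stub.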

6.6. Configuring SNMP for Notifications

Configuring JBoss ON to send SNMP alerts has two parts:
  • Configuring the SNMP alert plug-in for the server.
  • Configuring the actual alert with an SNMP notification.

6.6.1. JBoss ON SNMP Information

JBoss ON can send SNMP traps to other management stations and systems as part of alerting notifications. The data transmitted contain details about the alert, such as the name of the alert that was triggered and the resource name.
The data to include in the traps, as with other SNMP notifications, are defined in the JBoss ON MIB file, serverRoot/etc/RHQ-mib.txt. The default configuration for the MIB is shown in Example 1, “Default Alert Object in JBoss ON MIB”. The base OID for the JBoss ON alert is 1.3.6.1.4.1.18016.2.1 (iso.org.dod.internet.private.enterprises.jboss.rhq.alert).

Example 1. Default Alert Object in JBoss ON MIB

alertGroup OBJECT-GROUP
    OBJECTS {   alertName,
                alertResourceName,
                alertPlatformName,
                alertCondition,
                alertSeverity,
                alertUrl }
    STATUS  current
    DESCRIPTION "A collection of objects providing information about an alert"

With the default MIB file, each trap sends the alert definition name, resource name, platform, alert conditions, severity, and a URL to the alert details page.

6.6.2. Configuring the SNMP Alert Plug-in

The SNMP alert sender plug-in is the only alert notification plug-in that requires additional configuration before the notification method can be used. The SNMP plug-in has to be configured with the appropriate SNMP version and SNMP agent information.
  1. In the top menu, select the Administration tab.
  2. In the System Configuration menu, select the Plugins item.
  3. Open the Server Plugins tab, and click the name of the SNMP plug-in in the list.
  4. In the plug-in details page, click the Configure 'Alert:SNMP' link to open the configuration page for the plug-in.
  5. Click the EDIT button at the bottom of the configuration screen to make the fields active.
  6. All SNMP versions require information about the JBoss ON MIB OID and selected version. Fill in the appropriate values.
  7. SNMP version 1 and version 3 both require additional configuration. Expand the version-specific configuration section and fill in the information about the SNMP agent.
    It may be necessary to unselect the Unset checkbox to allow the fields to be edited.

6.6.3. Configuring the SNMP Alert Notification

Before JBoss ON can send any SNMP notifications, the SNMP alert plug-in has to be configured for the server, as described in Section 6.6.2, “Configuring the SNMP Alert Plug-in”.
  1. Configure the basic alert conditions and information for the resource, as described in Section 6.1, “Setting Alerts for a Resource”. Click OK to go to the next page to configure notifications.
  2. In the Notifications tab for the alert definition, give the notification method a name, and select the SNMP Trap method from the Alert Senders drop-down menu.
  3. Fill in the information for the SNMP trap.
    • The hostname for the SNMP manager.
    • The port number for the SNMP manager. JBoss ON supports UDP, so this must be the UDP port.
    • The JBoss ON OID. This is 1.3.6.1.4.1.18016.2.1.
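Before entering the trap settings from step 3, they can be sanity-checked with a few lines of JavaScript. This is only an illustrative sketch: validateTrapSettings and the sample host name are hypothetical, and the checks cover form only (a non-empty host, a valid port number, a dotted-decimal OID), not whether an SNMP manager is actually listening. Port 162 is the standard UDP port for SNMP traps.

```javascript
// Return a list of problems found in the trap settings (empty when valid).
function validateTrapSettings(settings) {
    var errors = [];
    if (!settings.host) {
        errors.push("host is required");
    }
    var port = Number(settings.port);
    if (!(port % 1 === 0 && port >= 1 && port <= 65535)) {
        errors.push("port must be an integer between 1 and 65535");
    }
    if (!/^\d+(\.\d+)+$/.test(String(settings.oid))) {
        errors.push("oid must be in dotted-decimal form");
    }
    return errors;
}

var goodSettings = {
    host: "snmp-manager.example.com", // hypothetical SNMP manager
    port: 162,
    oid: "1.3.6.1.4.1.18016.2.1"
};
var badSettings = { host: "", port: 70000, oid: "alert" };

var goodErrors = validateTrapSettings(goodSettings); // → []
var badErrors = validateTrapSettings(badSettings);   // → three messages
```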

6.7. Sending Alerts Based on Call-Time Data

Certain resource types deliver call-time (response time) data. This information contains pre-aggregated measurements for the maximum, minimum, or average results for the responses. Resources which collect call-time data can use that pre-processed information as the basis for alert notifications, the same as other monitoring data.
Two types of resources support call-time data:
  • Session bean methods
  • Web servers with response time monitoring configured
To configure call-time data alerts:
  1. Configure the basic alert definition, as in Section 6.1, “Setting Alerts for a Resource”.
  2. In the Conditions tab for the alert definition, click Add to add a monitoring condition.
  3. Select one of the call-time data options from the Condition Type list. Call-time change conditions trigger an alert for any change from the established baseline. Call-time threshold conditions trigger an alert if the call-time data moves past the given level or hits a certain value, regardless of the kind of change.
  4. Fill in the information about the call-time data to alert on. Call-time data are pre-aggregated (processed) in one of three ways: maximum, minimum, and average measurements. The Call Time Limit value sets which of the pre-aggregated measurements is being monitored for the alert.
  5. Complete the alert configuration by setting notification methods, recovery, and dampening settings.
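To make the pre-aggregation in step 4 concrete, the sketch below reduces a set of response times to their minimum, maximum, and average, and then checks one of those aggregates against a threshold. It is a simplified model rather than JBoss ON code; the function names, the property names, and the sample durations are hypothetical.

```javascript
// Reduce raw call durations (in milliseconds) to the three aggregates.
function aggregateCallTimes(durations) {
    var total = durations.reduce(function (sum, d) { return sum + d; }, 0);
    return {
        minimum: Math.min.apply(null, durations),
        maximum: Math.max.apply(null, durations),
        average: total / durations.length
    };
}

// Threshold condition: does the chosen aggregate exceed the limit?
function exceedsLimit(stats, limitType, threshold) {
    return stats[limitType] > threshold;
}

var stats = aggregateCallTimes([120, 80, 400]);
// stats → { minimum: 80, maximum: 400, average: 200 }

var maxBreached = exceedsLimit(stats, "maximum", 300); // → true
var avgBreached = exceedsLimit(stats, "average", 300); // → false
```

The same readings trip an alert on the maximum but not on the average, which is why the choice of aggregate matters when setting the condition.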

6.8. Enabling and Disabling Alert Definitions

When an alert definition is disabled, no alert notifications are triggered for that set of conditions. Disabling definitions is very useful when resources are being taken offline for a known reason (such as upgrades or maintenance) and any alerts triggered during that time would be spurious. Alert definitions can be re-enabled later just as easily.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Alerts tab.
  4. In the Definitions subtab, select any of the definitions to enable or disable.
  5. Click the Enable or Disable button.
  6. Confirm the action.

6.9. Viewing the Alert Definitions Report

While the alert definitions for a specific resource are always available by viewing that resource entry, it is also possible to view all of the alert definitions configured in JBoss ON in the Alert Definitions Report.
  1. Select the Reports tab in the top navigation bar.
  2. In the Subsystems menu box on the left, select Alert Definitions.
  3. The definitions report shows a list of all configured definitions, for all resources in the inventory.
    The results table provides the most basic information for the definitions:
    • The resource (Name).
    • The parent or ancestry. Since resources are arranged hierarchically, sorting by the parent is very useful for finding all alert definitions for all services and applications that relate to a high-level resource like a server.
    • The description of the alert.
    • Whether it is active (enabled).

NOTE

A user may have the right to create and edit an alert definition, but that does not mean that the user has the right to delete an item from the alert history.
Deleting elements in the history requires the manage inventory permission.

6.10. Using Alerting Templates and Group Alerts

Templates make it easy to apply the same configuration consistently and repeatedly, and JBoss ON allows templates to be set for alerts based on their general resource type.
Group alerts, like alert templates, apply equally to every member of a compatible group. Group alerts offer more control over which resources have the alert definition, however, since resources can be manually added to the group or selected based on a search filter. When a resource joins or leaves a group, its alert definitions are automatically updated.

6.10.1. Creating Alert Definition Templates

Alert templates are fully defined alert definitions, from conditions to notification methods, that are created for any of the managed resource types in JBoss ON. Servers or applications of the same type will probably have the same set of alert conditions that apply, such as free memory or CPU usage. An alert definition template creates an alert based on the general type of resource. So, there can be alert templates for Windows, Linux, and Solaris servers, Tomcat and Apache servers, and services like sshd and cron. Every time a resource of that type is added, the alert definition is automatically added to the resource with the predefined settings. Any alert assigned to a resource through a template can be edited locally for that resource, so these alert definitions are still flexible and customizable.
To create an alert definition template:
  1. In the top navigation, open the Administration menu, and then the System Configuration menu.
  2. Select the Alert Templates menu item. This opens a long list of resource types, both for platforms and server types.
  3. Locate the type of resource for which to create the template definition.
  4. Click the New button to create a global alert definition. Set up the alert exactly the same way as setting an alert for a single resource (as in Section 6.1, “Setting Alerts for a Resource”).
  5. Save the template.
The template definition is then applied to all current and new resources of that type.
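The way a template flows down to resources can be modeled as copying the template's definitions onto each resource of the matching type. The sketch below is a conceptual illustration only; applyTemplates, the data layout, and the sample template are hypothetical and do not reflect how the server stores definitions.

```javascript
// Give a resource its own copy of every template defined for its type.
function applyTemplates(resource, templatesByType) {
    var templates = templatesByType[resource.typeName] || [];
    resource.alertDefinitions = templates.map(function (template) {
        return {
            name: template.name,
            condition: template.condition,
            enabled: template.enabled
        };
    });
    return resource;
}

// Hypothetical template for Linux platforms.
var templatesByType = {
    "Linux": [
        { name: "Low free memory", condition: "free memory < 25MB", enabled: true }
    ]
};

var linuxServer = applyTemplates({ name: "db-01", typeName: "Linux" }, templatesByType);
var windowsServer = applyTemplates({ name: "ad-01", typeName: "Windows" }, templatesByType);

// A local edit changes only this resource's copy, not the template.
linuxServer.alertDefinitions[0].enabled = false;
```

Because each resource receives a copy, the local edit on db-01 leaves the Linux template enabled, and the Windows server, which has no template, receives no definitions.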

6.10.2. Configuring Group Alerts

Group alerts can only be set on compatible groups.
  1. In the Inventory tab in the top menu, select the Compatible Groups item in the Groups menu on the left.
  2. In the main window, select the group to add the alert to.
  3. Click the Alerts tab for the group.
  4. In the Definitions subtab, click the New button.
  5. Configure the basic alert definition and notifications, as in Section 6.1, “Setting Alerts for a Resource”.

6.11. Viewing Alerts

The alert history can be reviewed for a resource, a group of resources, a parent, or the whole JBoss ON server.

6.11.1. Viewing Alert Details for a Specific Resource

NOTE

A user may have the right to create and edit an alert definition, but that does not mean that the user has the right to delete an item from the alert history.
Deleting elements in the history requires the manage inventory permission.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the resource in the list.
  4. Click the Alerts tab, and make sure that the History subtab is selected.
  5. In the list, click the timestamp or alert definition name for the fired alert.
  6. The alert page has tabs for each detail for the alert, including which alert definition was triggered, the conditions that triggered, and any operations that were launched as a result.

6.11.2. Viewing the Fired Alerts Report

  1. Select the Reports tab in the top navigation bar.
  2. In the Subsystems menu box on the left, select Recent Alerts.
All of the alerts for all resources in JBoss ON are listed in the results table. Several results elements are useful for analysis:
  1. The resource (Name)
  2. The parent (ancestor)
  3. The name of the definition which triggered the alert
  4. The condition which triggered the alert
  5. The value of the resource at the time the alert was sent
  6. The date, which is very useful for correlating the alert notification to an external event

NOTE

A user may have the right to create and edit an alert definition, but that does not mean that the user has the right to delete an item from the alert history.
Deleting elements in the history requires the manage inventory permission.

6.11.3. Viewing Alerts in the Dashboard

All of the recently-fired alerts, by default, are listed on the Dashboard page of JBoss ON in the recent alerts portlet.

Figure 6. Recent Alerts Portlet


The alerts displayed in the portlet can be filtered by the following conditions:
  1. A time range for when the alert was fired
  2. The alert priority (which is initially configured in the alert definition)
These conditions are evaluated in order, meaning that alerts are filtered first based on time, then priority.
To set the conditions for the alerts portlet in the Dashboard page:
  1. In the top menu, click Dashboard.
  2. In the Recent Alerts portlet, click the gear icon to open the portlet configuration page.
  3. Change the display criteria as desired.
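The filter order described above (time range first, then priority) can be sketched as two successive passes over the list of alerts. This is an illustrative model, not the portlet's code; the function name, the field names, the priority labels, and the sample alerts are assumptions.

```javascript
// Keep alerts fired within the time range, then keep the chosen priorities.
function filterRecentAlerts(alerts, now, rangeMillis, priorities) {
    var inTimeRange = alerts.filter(function (alert) {
        return now - alert.firedAt <= rangeMillis;
    });
    return inTimeRange.filter(function (alert) {
        return priorities.indexOf(alert.priority) !== -1;
    });
}

var HOUR = 3600 * 1000; // milliseconds
var now = 10 * HOUR;    // fixed "current time" so the example is repeatable

// Hypothetical fired alerts.
var alerts = [
    { name: "High CPU", priority: "HIGH", firedAt: now - 1 * HOUR },
    { name: "Low memory", priority: "LOW", firedAt: now - 2 * HOUR },
    { name: "Disk full", priority: "HIGH", firedAt: now - 9 * HOUR }
];

var shown = filterRecentAlerts(alerts, now, 8 * HOUR, ["HIGH"]);
var shownNames = shown.map(function (alert) { return alert.name; });
// shownNames → ["High CPU"]
```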

6.12. Acknowledging an Alert

Acknowledging an alert is a way of identifying that the condition which triggered the alert has been addressed in some way. When an alert is acknowledged, the name of the user who acknowledged the alert is recorded. Recording the acknowledger's name allows the action to be audited later if necessary.
There are several different ways to acknowledge an alert:
  • Through the Recent Alerts Report
  • Through a group
  • Through the resource entry
Using the Recent Alerts Report is useful because you can acknowledge multiple alerts at the same time and for multiple resource types, which could be simpler if a known outage triggered many alerts. Acknowledging an alert is not a requirement to close the alert, but it can be useful as part of auditing an incident response or making sure that issues have been addressed.
  1. Select the Reports tab in the top navigation bar.
  2. In the Subsystems menu box on the left, select Recent Alerts.
  3. Select the alert to acknowledge.
  4. Click the Acknowledge button, and, when prompted, confirm the action.

NOTE

It is also possible to acknowledge a single alert through the alert details page.
When the alert is acknowledged, the Status shows the name of the user who acknowledged (and presumably resolved) the alert.