Using JBoss Operations Network for Monitoring, Deploying, and Managing Resources

Red Hat JBoss Operations Network 3.3

Recommendations and Procedures for Maintaining an Efficient JBoss and IT Infrastructure

Jared Morgan

Zach Rhoads

Abstract

The primary function of JBoss Operations Network is monitoring the status of your resources. The core of monitoring includes critical availability monitoring, collecting metrics on platform and server performance, and tracking events. JBoss ON also provides a way to define alerts and then notify administrators whenever a resource is performing poorly.
This guide provides GUI-based procedures to view monitoring information, to track events, to define alerts and notifications, and to initiate operations.

Chapter 1. Using the JBoss ON Web Interface

JBoss Operations Network has a rich, layered UI which covers a broad range of functionality. This chapters gives a brief summary of the major sections of the UI so that users can more effectively perform management tasks.

1.1. Supported Web Browsers

JBoss ON supports these browser releases for accessing the server GUI:
  • Firefox 17 or later
  • Internet Explorer 9

1.2. Logging into the JBoss ON Web UI

Aside from some minor configuration in its rhq-server.properties file, JBoss ON is completely administered through its web interface.
By default, the JBoss ON server listens over port 7080. (A different port can be configured when the server is installed, and the port number can be changed in the server configuration.) To connect to the server, open a standard HTTP page with a URL in the format hostname:port. For example:
http://server.example.com:7080
Then, log in using any valid username/password combination. The default administrative user has the name and password rhqadmin.

Figure 1.1. Logging into JBoss ON

Logging into JBoss ON

1.3. Configuring Internet Explorer

Some Internet Explorer settings can prevent the JBoss ON login page from loading properly. By default, Internet Explorer is in stealth mode, which disables some JavaScript access for websites. To allow the login page to load, add the IP address of the JBoss ON server to the whitelist for Internet Explorer.
  1. In Internet Explorer, click the gear icon in the upper right corner and select Internet options.
  2. Open the Security tab, and select the Local intranet icon.
  3. Click the Sites button.
  4. Click the Advanced button at the bottom of the pop-up window.
  5. Enter the JBoss ON server hostname or IP address in the Add this webiste to the zone: field, and click the Add.
  6. Close out the options windows.

1.4. A High Level Walk-Through

The JBoss ON UI is very rich — there are a lot of small elements that are all layered together to provide a very detailed and flexible interface for interacting with the JBoss ON servers and resources. To maximize its use of space, JBoss ON uses top navigation menus, tabbed browsing with subtabs, active links, and navigation trees to establish relationships between JBoss ON resources and JBoss ON functionality. In a very general view, several types of visual elements that work together to comprise the UI:
  • The top menu
  • The left menu tables
  • The dashboard
  • Resource-based tables, which can be for the resource inventory, a summary report, or the results of a search
  • Configuration pages which both provide details for and access to elements in JBoss ON, including resources, groups, plug-ins, and JBoss ON server settings
All of these elements fit together in a repeated and reliable pattern.

Figure 1.2. UI Elements All Together

UI Elements All Together
Understanding the type of page that you are viewing can make it easier to navigate through the JBoss ON UI and can help you more completely understand what you can accomplish in JBoss ON.

1.4.1. The Top Menu

At the very top of the JBoss ON UI is a menu bar with, with five tabs that go to the major configuration areas of JBoss ON.

Figure 1.3. The Top Menu

The Top Menu
Each menu item relates to a different functional aspect of JBoss ON.
  • The Dashboard contains a global overview of JBoss ON and its resources. Different, configurable snapshot summaries (called portlets) show different aspects of the resources and server, such as the discovery queue, recent alerts, recent operations, and resource counts.
  • The Inventory tab shows both resources and groups.
  • The Reports tab shows pre-defined reports. These are slightly different from the Dashboard, which focuses exclusively on resource information: the reports focus on the current actions of the different subsystem (or major functional areas) of JBoss ON, such as alerts, operations, metric collection, and configuration history.
  • The Bundles tab opens the provisioning and content functional area. This is for uploading and deploying content bundles that are used to provision new applications.
  • Administration goes to all areas related to configuring the JBoss ON server itself. This includes server settings, plug-ins, users and security, and agent settings.

1.4.2. The Left Menu

Rather than using drop-down or tabbed options, much of the configuration for JBoss ON is accessed through the left menu. There are individual tables that contain related areas of configuration, like users and groups, server configuration, server/agent connections, and content for the Administration area in Figure 1.4, “The Left Menu”.

Figure 1.4. The Left Menu

The Left Menu
Clicking the up or down arrows at the left of the menu tables collapses and expands the tables. This can make it easier to navigate the left menu.

Figure 1.5. Collapsing the Left Menu

Collapsing the Left Menu

1.4.3. Dashboard

The Dashboard is an overview of everything in JBoss ON, from recent actions (fired alerts and operations) to availability reports to newly discovered and imported resources. This page, unlike any other area of JBoss ON, is customizable so it can be used to display only the collection of information that you want to see. Each table of information is called a portlet, a mini-portal into a view of the JBoss ON server or resources. There can be multiple Dashboard views configured, with different portlets or different layouts; these are accessed by tabs at the top of the Dashboard page.

Figure 1.6. Dashboard View

Dashboard View
The Dashboard is the landing page for JBoss ON, the first page that comes up after login.

1.4.4. Inventory Browsers and Summaries

Some pages are essentially long tables of information, presented in basically the same way:
  • Tabs for different areas, with subtabs that further break down information
  • A table of results
  • Icons that open a configuration or task option for that specific entity
  • Buttons that perform actions (create, delete, or some other specific action) on the entries; some of these buttons aren't active unless an entry is selected
The inventory interface in Figure 1.7, “Inventory Browser” is rich with functionality. The search bar for resources and groups uses a specialized syntax and flexible dynamic search. Hovering over any resource name gives a small popup message with more information about that resource. Clicking the name of the entry itself opens its default entry configuration page, while clicking the name of its parent opens up that parent resource's configuration page.

Figure 1.7. Inventory Browser

Inventory Browser
In Figure 1.7, “Inventory Browser”, the UNINVENTORY SELECTED button is active because a resource is selected. If no entries are actively selected, activity buttons are grayed out.

1.4.5. Entry Details Pages

Possibly the most functionality-saturated area in JBoss ON is an entry's details page.
The left navigation area shows the hierarchy, both parents and children, of the selected resource. This makes it very easy to navigate among all of the different services and servers that affect a resource.

Figure 1.8. Resource Tree

Resource Tree
Right-clicking any of the resources in the left navigation opens shortcuts to that entry's configuration.
Note
Resources have short names that are automatically assigned based on their type, instance or system name, or IP address. These names are used in the inventory and in the tree navigation.
The configuration area of a resource entry page (and other JBoss ON entities, like plug-ins and templates) has three information layers that provide all of the possible functionality and tasks available for that entry.
The entry's configuration page is tabbed according to each area that can be configured, and frequently has subtabs for additional configuration options and to show the history of that area (like fired alerts, previous content updates, or monitoring data).

Figure 1.9. Tabs for a Resource Entry

Tabs for a Resource Entry
The next section in the entry area shows lists of related configured entries for that task. For example, an Operations area will have a list of available operations in a table below the tabs. For Inventory, there is a list of configured child resources, while Alerts shows all of the configured alerts for that resource. All of those entries are listed in a table similar to the search results available in other parts of the JBoss ON UI.
Many elements are both a details page and an edit page. meaning that many fields are active automatically. This makes it possible to perform management tasks directly, without opening a separate page.

Figure 1.10. Editable Areas for a Resource Entry

Editable Areas for a Resource Entry

1.4.6. Shortcuts in the UI

To the far right of the top menu is a small cluster of icons that provide very quick, targeted insight into JBoss ON.
  • The Message Center shows all notifications that have been sent by the JBoss ON server. This includes alerts, configuration changes, changes to the inventory, or error messages for the server or UI.
  • The Favorites button can be used to navigate to selected resources and groups quickly, while the little blue ribbon on resource pages can be used to add that resource to the favorites list.
  • The resource availability is shown as a green check mark if the resource is available and a red X if the resource is down.

Figure 1.11. Shortcuts

Shortcuts

1.4.7. Red Hat Access Menu

With Red Hat Access, customers can search help topics, submit support tickets, and utilize diagnostic services using exclusive Red Hat knowledge, resources, and functionality all from within the JBoss ON UI.
Subscribers can enjoy the following Red Hat Access services:
  • Conveniently access exclusive Red Hat knowledge and solutions.
  • Search error codes, messages, and other information, as well as view related knowledge from the Red Hat Customer Portal.
  • Create new and view existing support cases as well as attach JBoss Diagnostic Reporter (JDR) reports collected from JBoss Enterprise Application Platform 6 instances and other files to those cases

1.4.7.1. Basic Usage

To start using Red Hat Access, on the dropdown menu in the upper right corner of the top menu titled "Red Hat Access". Subscribers must log in using their Red Hat Customer Portal credentials to enable these services. The "Red Hat Access" menu will detect when users are not logged in and present them with a Red Hat Access login dialog when appropriate.
For account/password recovery, visit the Login Assistance page on the Customer Portal.

1.4.7.3. Support

Case management functions are launched from the "Support" sub-menu of the "Red Hat Access" menu. These functions allow subscribers to view and update existing cases in addition to creating new cases directly from the JBoss ON UI.

1.4.7.4. Opening a New Support Case

A new case screen can be launched from "Red Hat Access -> Support -> New Case" sub-menu.
Alternatively, this screen can be opened by clicking on one of the "Open A New Support Case" button from the case list screens. If coming from the search panel, the most recent searched for text will be inserted into the description field as well as the appropreate product and version information. Additionally, Red Hat Access will automatically analyze the "Subject" and "Description" input fields and try to recommend Knowledgebase solutions that are relevant to the case being created.
The user may attach any relevant log files to the case during case creation. Clicking the checkbox next to a file in the Server file(s), then clicking "Update Attachments" will attached and upload the selected files.
Note
In addition to JBoss ON, this process may be used to open a support case against other Red Hat supported products as well.

1.4.7.5. Opening a New Support Case Against a Product on Supported Application Servers

A new case against a product on supported application servers can be started simply by navigating to the instance, right clicking and selecting "Open Support Case".
A case may then be opened against that product with the option of attaching JDR reports where avaiable.

1.4.7.6. View Existing Support Cases

Existing cases can be viewed from the "Red Hat Access -> Support -> My Cases" sub-menu.
The case list can be filtered by using the search box, selecting a case group, or by using the dropdown to control visibility to open and/or closed cases.

1.4.7.7. Editing a Case

Clicking on a case from the case list allows you to modify the case just as you would in the Red Hat Customer Portal. Click the "Update Details" button to make any changes in that section permanent.
In addition to recommending Knowledgebase solutions during case creation, Red Hat Access will also analyze the case's metadata and try to recommend relevant Knowledgebase solutions. The results are based on Red Hat's prior experience with similar issues which may help a customer resolve the issue more quickly. These recommended Knowledgebase solutions are shown when an existing case is being viewed or edited.
Any of the recommended Knowledgebase solutions can also be attached to an existing case by clicking on the pin icon next to the recommendation.

1.5. Getting Notifications in the Message Center

The Message Center shows all of the messages that have been returned by the GUI for the current browser session. This includes any actions taken in the UI — like adding resources to the inventory, configuring resources, or uploading content — and it also includes any error messages that may have been returned during the session.

Figure 1.12. Message Center

Message Center

1.6. Sorting and Changing Table Displays

Almost all of the information in JBoss ON is displayed in tables, from the resource inventory to the list of plug-ins for the agent. The SmartGWT UI has some versatility in how that table information is sorted and displayed.
A few tables use a very simple ascending/descending order based on the column being sorted, either numerically or alphabetically.

Figure 1.13. Basic Table Sorting on the Partition Events List

Basic Table Sorting on the Partition Events List
Most areas in the UI allow a more complex method of displaying information. As with basic tables, simply clicking a column name will sort that column in ascending/descending order. However, advanced GWT tables also have an option to change the table layout and sort options, by clicking a menu arrow at the right of the column.

Figure 1.14. Basic Table Sorting on the Server Resources List

Basic Table Sorting on the Server Resources List
When a menu arrow is selected, the sort order for that column can be changed, or any other column. You can also change the column sizing and even the types of columns displayed. The options are generated dynamically, depending on what kind of entry is contained in the table.

Figure 1.15. Advanced Table Sorting on the Server Resources List

Advanced Table Sorting on the Server Resources List
The sort order can even be prioritized by specifying multiple criteria. For example, resources can be sorted by name, then by plug-in, then by ID. Since resources have standardized names, sorting by name or parent alone may not be specific enough to give a meaningful order to the entries; providing multiple, prioritized criteria can make the table display more accurate.

Figure 1.16. Changing the Sort Method

Changing the Sort Method

1.7. Customizing the Dashboard

The Dashboard is configurable. It is composed of individual portlets, and these portlets can be rearranged or independently refreshed through the icon menu displayed on each portlet. There can even be multiple Dashboards, with different portlets, which can be used to give different and specific views into JBoss ON and its resources.
Note
Dashboards are configured per user, not globally.

1.7.1. Editing Portlets

To move portlets within the Dashboard layout, use the arrows in the portlet tool bar. To get rid of a portlet in a current, click the minimize icon on the far left to collapse it or click the X icon on the far right to delete the portlet from the Dashboard entirely. Some types of portlet allow customization, which can be accessed by clicking the wrench icon.

Figure 1.17. Portlet Icons

Portlet Icons

1.7.2. Adding and Editing Dashboards

The Dashboard page can actually contain multiple Dashboards, each with different portlets, column layouts, and refresh intervals. This makes it possible to get a logical grouping of information for a very fast assessment of the state of resources in JBoss ON. When multiple Dashboards are configured, they are displayed as tabs in the UI.

Figure 1.18. Tabbed Dashboards

Tabbed Dashboards
To add a new Dashboard:
  1. Click the New Dashboard button in the far right of the main Dashboard.
    Note
    The process of editing and adding Dashboards is very similar. The only difference is that to edit a Dashboard, you click the Edit Mode button.
  2. The new Dashboard opens in the edit mode. Enter a name for the new Dashboard.
  3. Add the desired portlets to the Dashboard. If necessary, change the number of columns to fit the number of portlets.

1.8. Setting Favorites

Using favorites makes it easy to navigate to resources that administrators need to access routinely for configuration updates, monitoring, or alerting.
Each resource has a small ribbon icon in the upper right corner of its details page. Clicking that icon automatically adds it to the resource favorites list.

Figure 1.19. Favorites Icon

Favorites Icon
The resource and group favorites are listed in the Favorites in the shortcuts on the right of the top menu. Clicking a resource on that list automatically opens its details page without having to search for the resource. Because multiple resources may share a name or some properties, the Favorites list includes a hover with more details about the resource so you can select the right one.

Figure 1.20. Favorites List

Favorites List

1.9. Deleting Entries

Resource-related entries can be deleted through the inventory browser or group browser. Most JBoss ON server configuration entries cannot be deleted. Only user-supplied elements, like plug-ins, content, roles, and users, can be deleted.
If an item can be deleted, then a delete button is available in the table list or details page for that item.

Figure 1.21. Delete Button in the Area Browser

Delete Button in the Area Browser
Note
A user may have the right to change something, but that does not implicitly grant the right to delete something. For example, users with the configuration write permission can edit resource configuration and view configuration history and settings, but they cannot delete elements in the configuration history. Similar constraints are true for users with permission to create and edit operations and alerts — there is no right to delete elements in the resource history.
Deleting elements in the history requires the manage inventory permission.

Chapter 2. Dynamic Searches for Resources and Groups

The inventory area has a dynamic search to look for resources and groups.
The dynamic search is an additional tool that can help manage your JBoss ON resources. Dynamic searches in JBoss ON can be saved to provide fast and reproducible snapshots of your JBoss ON deployment that match criteria that are relevant to your infrastructure, a kind of quick report.
A dynamic search checks both resources and groups (recursively into group members, as well) much more effectively than either a subsystem views search or a quick search. A search can begin against a specific identifying attribute of a resource (such as its name, parent, type, or JBoss ON category) and then has rules that can set how the search handles the string. Multiple search parameters can be strung together to make precise and complex searches. Dynamic searches can be saved and reused later so their results are reliably reproducible. (Section 2.2, “About the Dynamic Search Syntax” covers the details more.)
There are other aspects of dynamic searches like the autocomplete, hints, and highlight search strings that make it easier to use effectively than the limited substring and quick searches. These are covered in Section 2.1, “About Search Suggestions”.

2.1. About Search Suggestions

Dynamic searches are extremely powerful, past simply finding resources. Dynamic searches can run through values in a number of different resource traits, not only the resource name. Dynamic searches can even be saved, so they're repeatable and can be used as ad hoc reporting.
Dynamic searches are easy to use because of search suggestions. A drop-down menu for every search provides three different types of suggestions:
  • Saved searches, which contain previous custom search strings and a count of resources which match that search
  • Query searches, which provide prompts for available resource traits
  • Text searches, which provide a list of resources based on some property in the resource which matches the text prompt

Figure 2.1. Types of Search Suggestions

Types of Search Suggestions
When search terms are entered in the field, the matching substrings in possible matching resources are highlighted. By default, the suggestions can match any substring in the resource or in resource configuration traits. The suggestions can be limited to match the string at the beginning or end of the matching attribute using different operators (covered in Section 2.2, “About the Dynamic Search Syntax”).

Figure 2.2. Highlighting Search Terms

Highlighting Search Terms

2.2. About the Dynamic Search Syntax

JBoss ON has its own search syntax for dynamic searches. The syntax is supposed to be relatively simple while covering a wide array of search-able items and allowing different phrases to be coupled together.
The basic dynamic search matched whatever text is entered in the search box in a general substring search. The search can allow a more detailed and targeted syntax, in this form:
[search_area].[search_property] operator value operator additional_search
The search_area identifies what type of entry — resource or group — is being searched for. This is an optional value because the search area is implied by the location of the search; i.e., searching in the Resources area implies a resource search, so it's not necessary to include the resource. part of the search.

Figure 2.3. Searching by Resources Traits

Searching by Resources Traits

2.2.2. Property Searches

The search can be narrowed by looking for a specific value or type of attribute in the entry by using a search property. For example, looking for a resource with a CPU usage of 80% (trait) is different than looking for an entry with an ID that includes 80 (id). The available properties are listed in Table 2.2, “Resource Search Contexts” and Table 2.3, “Group Search Contexts”.
Note
It's possible to search using group criteria in the resource search, and the reverse, by specifying the search area and the appropriate properties. For example, it's possible to do a search in the groups area to return the list of groups that a specific resource belongs to. This is done by explicitly passing the search context and search property. For example, in the Groups page, to list any group which contains a resource managed by the Postgres plug-in:
resource.type.plugin = Postgres
Important
The parameter suggestions for connection, configuration, and trait use the internal property names for the property names (connection[property_name]) rather than the names used in the JBoss ON GUI.

Table 2.2. Resource Search Contexts

Property Description
resource.id The resource ID number assigned by JBoss ON.
resource.name The resource name, which is displayed in the UI.
resource.version The version number of the resource.
resource.type.plugin The resource type, defined by the plug-in used to manage the resource.
resource.type.name The resource type, by name.
resource.type.category The resource type category (platform, server, or service).
resource.availability The resource availability, either UP or DOWN.
resource.pluginConfiguration[property-name] The value of any possible configuration entry in a plug-in.
resource.resourceConfiguration[property-name] The value of any possible configuration entry in a resource.
resource.trait[property-name] The value of any possible measurement trait for a resource.
There are slightly fewer search properties for groups, since groups have simpler entries than resources.

Table 2.3. Group Search Contexts

Property Description
group.name The name of the group.
group.plug-in For a compatible group, the plug-in which defines the resource type for this group.
group.type For a compatible group, the resource type for this group.
group.category The resource type category (platform, server, or service).
group.kind The type of group, either mixed or compatible.
group.availability The availability of resource in the group, either UP or DOWN.
The operator first refers to how the results should match the search string (value). This can require an exact match, every value but the one given in the search string. The operator then refers to how multiple search strings relate to each other (AND or OR); both explicit AND and OR statements and parenthetical statements are allowed. Complex searches are covered in Section 2.2.3, “Complex AND and OR Searches”.

Table 2.4. Search String Operators

Operator Description
= Case-insensitive match.
== Case-exact match.
!= Case-insensitive negative match (meaning, the value is not the string).
!== Case-exact negative match (meaning, the value is not the string).

2.2.3. Complex AND and OR Searches

The dynamic search bar assumes that each individual word is a search term (unless terms are defined using quotation marks). Implicitly, multi-word searches are treated as AND searches. For example:
postgres server myserver
This is treated as a series of AND terms:
postgres AND server AND myserver
The dynamic search also allows OR searches, with terms separated by a pipe (|). For example:
postgres | jbossas
Both AND and OR searches can be entered, and complex searches can be written by stringing multiple search strings together. When there are both AND and OR search criteria, the AND terms are processed first. For example, this search term searches for both B and C, and then either A or B/C.
a | b c
Note
When there are both AND and OR terms used in a complex search, AND terms are given preference. However, terms in parenthesis are evaluated even before AND expressions, so parentheses can be used to override the natural search preference.
Search phrases can be nested to multiple levels using parentheses to group search terms. These parentheses can also be used to override the preferences for AND matches, forcing at least some OR expressions to be processed first. For example, this expression searches for the OR terms first, matching a OR b and c OR d, and then running an AND search on the results of the two OR searches:
(a | b) (c | d)
The results will contain several combinations of values: a c, a d, b c, and b d.
Multiple levels of nesting are allows. For example, this expression requires a AND either b OR c AND d:
(a) (b | (c d))
The matching resources, then, can contain values matching a c d or a b.

Chapter 3. Viewing and Exporting Reports

The purpose of JBoss ON is to deliver information about the resources in your infrastructure. There are a lot of places in the UI where information is displayed for individual resources or for defined groups.
The Reports main tab has a list of predefined searches and views into different areas for all resources, not just a subset.

3.1. Types of Reports

All reports show a list of information that spans everything in the JBoss ON inventory. Reports are broken into two categories, one for JBoss ON server subsystem areas and one for different inventory counts.
Subsystem reports are related to functionality in JBoss ON, for different aspects of monitoring, alerting, drift, and configuration. Most subsystem reports have some corollary to a resource-level chart, with much the same information displayed. Subsystem reports include additional columns to list the resource name and the resource ancestry (parent and grandparent resources) to disambiguate each resource, since names are not unique.
Inventory reports give counts. These reports usually begin with a breakdown by resource type, with additional resource lists available as "subreports."

Figure 3.1. Inventory Summary Report

Inventory Summary Report

Table 3.1. Types of Reports

Report Name Description Has Filters?
Subsystem Reports
Suspect Metrics Lists any metrics outside the established baselines for a given resource. All suspect metrics for all resources are listed, but the baselines which mark the metric may be different for each resource, even different between resources of the same type. No
Configuration History Lists all configuration changes, for all resources. Version numbers are incremented globally, not per resource. The configuration history shows the version number for the change, the date it was submitted and completed, its status, and the type of change (individual or through a group). No
Recent Operations Lists all operations for all resources, by date that the operation was submitted (not necessarily run), the operation type, and its status. Yes
Recent Alerts Lists every fired alert for all resources, with the name of the resource, the alert definition which was fired, and the alerting condition. Yes
Alert Definitions Lists all configured alert definitions, for all resources, with their priority and whether they are enabled. No
Recent Drift Contains a list of all snapshots, for all resources and drift definitions. Yes
Inventory Reports
Inventory Summary Contains a complete list of resources currently in the inventory, broken down by resource type and version number. No
Platform Utilization Shows the current CPU percentage, actual used memory, and swap space. No
Drift Compliance Shows a list of all resource types which support drift and then shows how many drift definitions are configured and whether the group is compliant. Clicking on a resource type shows the list of resources configured for drift and their individual compliance status. No

3.2. Exporting Report Data to CSV

The Reports tab collects information that is not easily accessible in other parts of the GUI or even the CLI, without complex scripting. The information from any report can be exported to CSV simply by clicking the Export button.

Figure 3.2. Exported Inventory Summary

Exported Inventory Summary
Only the information displayed in the report, as displayed in the report, is exported to CSV. If there is a certain sort order applied to the report or if a filter is used to limit the displayed entries, that sort order and that filter are preserved in the exported report CSV file.

Figure 3.3. Report with Date Filters

Report with Date Filters

Part I. Inventory, Resources, and Groups

Chapter 4. Interactions with System Users for Agents and Resources

The agent runs as a specific system user, and so do servers such as JBoss and Apache which are managed by JBoss ON. The general assumption with many of the agent management tasks, including discovery, is that the agent user is the same as the resource user. If the users are different, then that can have an impact on how resources can be discovered and managed.
The common types of servers which JBoss ON manages are:
  • JBoss EAP servers
  • PostgreSQL databases
  • Tomcat servers
  • Apache servers
  • Generic JVMs
For some management operations initiated by the JBoss ON agent, the agent system user is never even involved. For example, the JBoss EAP plug-in connects to the EAP instance using authentication mechanisms managed by JBoss EAP itself, so no system ACLs or user permissions are required. As long as the user can access the JBoss EAP instance, everything works.

Table 4.1. Cheat Sheet for Agent and Resource Users

Resource User Information
PostgreSQL
No effect for monitoring and discovery.
The agent user must have read/write permissions to the PostgreSQL configuration file for configuration viewing and editing.
Apache
No effect for monitoring and discovery.
The agent user must have read/write permissions to the Apache configuration file for configuration viewing and editing.
Tomcat Must use the same user or the agent can not be discovered.
JMX server or JVM Different users are fine when using JMX remoting; cannot be discovered with different users and the attach API.
JBoss AS/EAP
EAP 5 and earlier: Different users are all right, but require read permissions on run.jar and execute and search permission on all ancestor directories for run.jar.
EAP 6 and later:: The user running the agent must have read permissions to the application server's configuration files.

4.1. The Agent User

There is a general assumption that the agent runs as the same user as the managed resources, and this is the cleanest option for configuration.
When the JBoss ON agent is installed from the agent installer JAR file, the system user and group who own the agent installation files is the same user who installs the JAR. So, a special system user can be created or selected, and then the agent can be installed by that user.

4.2. Agent Users and Discovery

An agent discovers a resource by searching for certain common properties, such as PIDs and processes or start scripts.
It does not necessarily matter whether the agent has superior privileges as the resource user.
For most resources, the agent simply requires read access to that resource's configuration. For resources like Apache and Postgres, as long as the agent can read the resource configuration, the resources can be discovered.
For some other resources, the agent user has to have very specific permissions:
  • For JBoss EAP resources, the agent must have read permissions to the run.jar file, plus execute and search permissions for every directory in the path to the run.jar file.
  • When a JBoss EAP 6 instance is installed from an RPM, the agent user must belong to the same system group which runs the EAP instance. This is jboss, by default.
  • Tomcat servers can only be discovered if the JBoss ON agent and the Tomcat server are running as the same user. Even if the agent is running as root, the Tomcat server cannot be discovered if it is running as a different user than the agent.
  • If a JVM or JMX server is running with JMX remoting, then it can be discovered if the agent is running as a different user. However, if it is running with using the attach API, it has to be running as the same user as the agent for the resource to be discovered.

4.3. Users and Management Tasks

The system user which the agent runs as impacts several common agent tasks:
  • Discovery
  • Deploying applications
  • Executing scripts
  • Running start, stop, and restart operations
  • Creating child resources through the JBoss ON UI
  • Viewing and editing resource configuration
The key thing to determine is what tasks need to be performed and who needs to perform that operation, based on limits on the resource or the operating system for permissions or authorization.
For some actions — discovery, deploying applications, or creating child resources — setting system ACLs that grant the agent user permission are sufficient.
For running operations or executing scripts, it may be necessary to run the task as a user other than the agent user. This can be done using sudo.
Whatever method, the goal is to grant the JBoss ON user all of the required system permissions necessary to carry out the operations.

4.4. Using sudo with JBoss ON Operations

The time to use sudo is for long-running operations, such as starting a service or a process, or for scripts which are owned by a resource user. The user which executes the script should be the same as the resource user because that user already has the proper authorization and permissions.
The user can really be the same, or the JBoss ON user can be granted sudo rights to the given command.
When elevating the agent user's permissions, two things must be true:
  • There can be no required interaction from the user, including no password prompts.
  • It should be possible for the agent to pass variables to the script.
To set up sudo for resource scripts:
  1. Grant the JBoss ON agent user sudo rights to the specific script or command. For example, to run a script as the jbossadmin user:
    [root@server ~]# visudo
    
    jbosson-agent     hostname=(jbossadmin)  NOPASSWD: /opt/jboss-eap/jboss-as/bin/*myScript*.sh
    Using the NOPASSWD option runs the command without prompting for a password.
    Important
    JBoss ON passes command-line arguments with the start script when it starts an EAP instance. This can be done either by including the full command-line script (including arguments) in the sudoers entry or by using the sudo -u user command in a wrapper script or a script prefix.
    The second option has a simpler sudoers entry
  2. Create or edit a wrapper script to use. Instead of invoking the resource's script directly, invoke the wrapper script which uses sudo to run the script.
    Note
    For the EAP start script, it is possible to set a script prefix in the connection settings, instead of creating a separate wrapper script:
    /usr/bin/sudo -u jbosson-agent
    For example, for a start script wrapper, start-myScript.sh:
    #!/bin/sh
    # start-myScript.sh
    # Helper script to execute start-myConfig.sh as the user jbosson-agent
    #
    sudo -u jbosson-agent /opt/jboss-eap/jboss-as/bin/start-myConfig.sh
  3. Create the start script, with any arguments or settings to pass with the run.sh script. For example, for start-myConfig.sh:
    nohup ./run.sh -c MyConfig -b jonagent-host 2>&1> jboss-MyConfig.out &

Chapter 5. Managing the Resource Inventory

The inventory in JBoss ON is the repository that contains all of the servers and applications that are managed or monitored by JBoss Operations Network. The inventory tells JBoss Operations Network which resources it can manage.
Once in the inventory, resources can be organized in several different ways. Resources can be grouped automatically by their type in autogroups, resources can be added manually to user-defined groups, and they can be added manually to another resource as a child.
This section covers the process of identifying and importing resources through discovery, adding children, and managing groups.

5.1. About the Inventory: Resources

The JBoss ON inventory is the central list of every managed resource that is recognized by the JBoss ON server.

5.1.1. Managed Resources: Platforms, Servers, and Services

Each JBoss ON agent periodically scans the platform where it's installed to check for services and servers. That is the discovery process. When a potential resource is discovered, then it is listed in the discovery scan results, and, from there, an administrator can choose whether it should be managed by JBoss ON. If a resource should be managed, then it must be imported into the JBoss ON server's inventory; otherwise, it can be ignored.
There are three categories of resources in JBoss ON:
  1. Platforms (operating systems)
  2. Servers
  3. Services
The resource hierarchy in the JBoss ON inventory mimics how programs and processes are physically structured on a platform. The highest level is the platform. The platform can have both servers and services as children. Likewise, servers can have both other servers and services as children, while services can have only other services as children within the inventory.

Figure 5.1. An Example Resource Hierarchy

An Example Resource Hierarchy
A handful of rules govern the relationships between resources in inventory:
  • A resource can only have one parent.
  • A server can be a child of a platform (such as JBoss AS on Linux) or another server (such as Tomcat embedded in JBoss AS).
  • A service can be a child of a platform, a server (such as the JMS queue on JBoss AS), or another service (e.g. a table inside a database).
  • Platforms, servers, and services can have many children services.
JBoss ON can manage many different types of resources; each managed resource has a corresponding agent plug-in which defines things like the available monitoring metrics, operations, and supported versions for the resource.
Note
Additional resources can be added by writing custom plug-ins for the resource type.

5.1.2. Content-Backed Resources

For application servers, there is a close conceptual link between a child resource and content which is deployed on the server. For example, EARs and WARs are both child resources of application servers and are versioned content which can be stored in a repository, selectively deployed, and reverted.
A content-backed resource is treated as a resource in that it has a place in the inventory hierarchy, can have operations run against it, and can have metrics collected for it. However, it is also managed as a software package, with bits that are uploaded and stored in the JBoss ON content system and maintained within a repository.
Content-backed resources can be manually added as children or, if deployed outside JBoss ON, can be detected in a package discovery scan (which runs every 24 hours by default). These child resources can also be created and updated by deploying content from the repositories in JBoss ON.
For more information on managing content-backed resources, see Chapter 32, Managing JBoss EAP 6 (AS 7) and Chapter 31, Managing JBoss EAP 5.
Important
Content-backed resources can have a significant impact on disk space requirements.
JBoss ON stores all versions of content. This is part of versioning control, allowing changes to content-backed resources to be reverted and managed.
Therefore, the system which hosts the backend database (Oracle or PostgreSQL) must have enough disk space to store all versions of all content for any resource. Additionally, the database itself must have adequate tablespace for the content.
When calculating the required amount of space, estimate the size of every artifact, and then the number of versions for each artifact. At a minimum, have twice that amount of space available; both PostgreSQL and Oracle require twice the database size to perform cleanup operations like vacuum, compression, and backup and recovery.

5.1.3. Resources in the Inventory Used by JBoss ON

Some resources are automatically added to platforms to enable certain JBoss ON-specific functionality. For example, an Ant bundle handler resource is added to platforms as a child service to allow the agent to identify and process Ant recipes. [1] Without that Ant bundle handler resource, the JBoss ON agent cannot perform provisioning on that platform. Administrators do not have to interact directly with JBoss ON-specific child resources once they are in the inventory, but these child resources must be present for JBoss ON functionality to work. Because these children are required by JBoss ON, they are imported automatically with the platform.
Other resources can be added to the inventory for the JBoss AS server used by the JBoss ON server, the JBoss ON agent, and JBoss elements associated with the agent and server such as the agent JVM, JBoss Cache, and the agent launch script and rhq-agent-env.sh script. Adding these resources to the JBoss ON inventory allows JBoss ON to monitor and manage all of the agents and servers in the deployment.

5.2. Discovering Resources

Before any application or platform can be managed by JBoss ON, it must be imported into the inventory. There are different ways of adding resources to the inventory, depending on how the resource was discovered.

5.2.1. Finding New Resources: Discovery

When an agent is installed and every time it starts up, it scans the platform, and all applications on it, for any servers, services, or other items which can be included into the inventory. The process of finding potential resources is called discovery.
There are different scans for each type of resource: platform, server, and service. High level scans for servers and platforms are initiated by the agent every 15 minutes. A service scan detects lower-level services that are running in servers that have already been imported into the inventory. These scans run by default every 24 hours. Both of these intervals are configurable in the JBoss ON agent configuration.
The agent always runs a scan for new resources when it starts up, and then periodically at its configured intervals. There is a delay between when the parent and immediate resources are imported and when another discovery scan is initiated to discover lower-level children; this prevents possibly long-running recursive discovery scans for resources which may potentially be ignored.
Note
All of the discovery scan intervals are configurable in the agent's configuration file.
JBoss ON agents send information about the platform and servers it discovers back to the JBoss ON server.
A server must be imported into the inventory before any of its child processes, servers, or services can be detected by the discovery scan.
When a platform is imported into the inventory, several of its child servers and services are imported automatically as well. This includes resources that are vital to the platform (like CPU, network adapters, and filesystems) as well as resources that are used by the JBoss ON server itself (such as the Ant bundle handler resource, which is used by the provisioning subsystem).
Although discovery is run automatically by the agent, discovery can also be initiated manually to capture infrastructure changes immediately.

5.2.2. Running Discovery Scans Manually

Discovery scans are run automatically by the agent to identify new resources as they are added to a platform. After the agent runs a full discovery scan upon server start, the server scans are run every 15 minutes and service scans every 24 hours. New resources can be added between the discovery scans, therefore administrators can initiate a manual discovery scan asynchronously.
The simplest way to initiate a discovery scan is to run the agent's discovery command at the agent command prompt:
  1. Click the Inventory tab in the top menu.
  2. Select Platforms from the left menu, then the platform on which a scan should run.
  3. Open the Operations tab. In the Schedules sub-tab, click the New button.
  4. Open the Servers - Top Level Resources link on the left, and select the agent resource.
  5. Select the Manual Discovery operation from the drop-down menu, and select whether to run a detailed discovery (servers and services) or a simple discovery (servers only).
  6. In the Schedule area, select the radio button to run the operation immediately.
  7. Click the Schedule button to set up the operation.

5.2.3. Importing Resources from the Discovery Queue

  1. Click the Inventory tab in the top menu.
  2. In the Resources menu on the left, select Discovery Queue.
  3. Select the checkbox of the resources to be imported. Selecting a parent resource (such as a platform) gives the option to automatically import all of its children, too.
  4. Click the Import button at the bottom of the UI.

5.2.4. Ignoring Discovered Resources

When the JBoss ON agent discovers an application or service which is to be ignored from the inventory, the server can be instructed to ignore those resources in the discovery queue. If the resource has already been imported to the inventory, see Section 5.2.5, “Ignoring Imported Resources”.
Note
A resource can only be ignored if its parent is already added to the inventory.
  1. Select Inventory from the top menu.
  2. Select the Discovery Queue item under the Resources menu on the left side of the screen.
  3. Select the checkbox of the resource to be ignored. Selecting a parent resource automatically selects all of its children.
  4. Click the Ignore button at the bottom of the page.
Note
It is not possible to ignore a platform. If a platform should not be in the inventory, do not run an agent on that machine.

5.2.5. Ignoring Imported Resources

Once a resource has been imported into the JBoss ON inventory, the server and the managing agent can be set to ignore those resources. There are three locations within the JBoss ON User Interface where you set a resource to ignore;
  • Inventory Resources pages,
  • the Inventory page of the parent resource, or
  • the resource Groups inventory page.
Note
If the resource to be ignored has been discovered but not yet imported into inventory, see Section 5.2.4, “Ignoring Discovered Resources”.
Note
It is not possible to ignore a platform. If a platform should not be in the inventory, you should remove the JBoss ON agent on that machine.

5.2.5.1. Ignoring Resources from a Resources page

  1. From the Inventory menu, select the relevant resource view under Resources. For example;
    • Inventory > Resources > All Resources, or
    • Inventory > Resources > Services.
  2. Select the row containing the resource to ignore. Multiple resources can be selected if required.
  3. Click the Ignore button at the bottom of the page.

5.2.5.2. Ignoring resources from the Inventory page of the parent resource

  1. From the Inventory menu, select the relevant resource view under Resources. For example;
    • Inventory > Resources > All Resources, or
    • Inventory > Resources > Services.
  2. Locate and select the parent resource from the resource list.
  3. Within the parent resource page, select the Inventory tab.
  4. From the parent resources Inventory tab, select the Child Resources sub-tab.
  5. Select the row containing the resource to be ignored from the Child Resources list. Multiple rows can be selected if required.
  6. Click the Ignore button at the bottom of the page.

5.2.5.3. Ignoring resources from a Groups page

  1. From the Inventory menu, select the relevant resource group under Groups. For example;
    • Inventory > Groups > All Groups, or
    • Inventory > Groups > Compatible Groups
  2. Locate the resource group that contains the resource to be ignored.
  3. Within the resource group page, select the Inventory tab.
  4. From the resource groups Inventory tab, select the Members sub-tab.
  5. Select the row containing the resource to be ignored from the Members list. Multiple rows can be selected if required.
  6. Click the Ignore button at the bottom of the page.

5.2.6. Ignoring an Entire Resource Type

The procedure in Section 5.2.4, “Ignoring Discovered Resources” is for ignoring a single, specific resource after it has been discovered on the system. Other resources of the same type on that platform or other platforms will still be discovered and included in the inventory. Ignoring that one resource has no impact on the discovery process.
However, there may be certain types of resources which will never be imported into the inventory. For example, filesystem resources may always be ignored on platforms, or a low-level EJB service may always be ignored on an EAP 6 server. In those instances, there is a maintenance overhead to discovering resources on the agent, having an administrator manually ignore them, and then maintaining them in inventory as an ignored resource.
If a certain type of resource will never be monitored or managed by JBoss ON, then that entire resource type can be ignored. This essentially disables discovery for that resource type.
Ignoring a resource type is a global, JBoss ON-wide setting and is configured in the JBoss ON server configuration.
Note
It is not possible to ignore a platform. If a platform should not be in the inventory, do not run an agent on that machine.
  1. In the top menu, click the Administration tab.
  2. In the Configuration menu table on the left, select the Ignored Resource Types item.
  3. Every available resource type, based on the loaded agent plug-ins, is listed in the Ignored Resource Types page. To ignore a resource, click the pencil icon.
    That toggles whatever the current enabled/disabled setting is for ignoring the resource. If a resource type is enabled, then it will be discovered by the agent. If it is disabled, it will be ignored.
  4. Scroll to the bottom of the page and click the Save button.

5.3. Resources That Require Additional Configuration for Discovery

There are some resource types which require specific configuration for the resource or for the JBoss ON agent in order for them to be discovered.

5.3.1. Configuring the Agent to Discover EAP Instances

As covered in Chapter 4, Interactions with System Users for Agents and Resources, the system user as which the agent runs has a direct effect on how the agent can manage certain resource types. For EAP instances, the agent's system user must have the appropriate permissions to be able to manage EAP resources:
  • For JBoss EAP 5 and earlier, the agent must have read permissions to the run.jar file, plus execute and search permissions for every directory in the path to the run.jar file.
  • For JBoss EAP 6 and 7, the agent must have read permissions to the application server's configuration files.
  • When a JBoss EAP instance is installed from an RPM, the agent user must belong to the same system group which runs the EAP instance. This is jboss, by default.

5.3.2. Configuring Tomcat/EWS Servers for Discovery (Windows)

Tomcat servers are discovered automatically on Linux and Unix systems, but they require additional configuration before they can be discovered on Windows systems.
  1. Run regedit.
  2. Navigate to Java preferences key for the Tomcat server, HKEY_LOCAL_MACHINE\SOFTWARE\Apache Software Foundation\Procrun2.0\TomcatVer#\Parameters\Java.
  3. Edit the Options attribute, and add these parameters:
    -Dcom.sun.management.jmxremote.port=9876
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
  4. Restart the Tomcat service.
After a few minutes, the Tomcat instance should show up in the Discovery Queue.

5.4. Importing New Resources Manually

Discovery scans are run on a defined schedule. There may be an instance where you add a new server or service on a platform and want to add it immediately to the JBoss ON inventory, before the next scheduled discovery run. It is possible to add that new child resource manually by importing it into the inventory of the parent resource — without waiting for the next discovery scan.
Note
The parent resource must be in an available state in order to import a child resource.
  1. Click the Inventory tab in the top menu.
  2. Search for the parent resource of the new resource.
    Chapter 2, Dynamic Searches for Resources and Groups has information on searching for resources using dynamic searches.
  3. Click the Inventory tab of the parent resource.
  4. Click the Import button in the bottom of the Inventory tab, and select the type of child resource. The selection menu lists the possible types of child resources for that parent.
  5. Fill in the properties to identify and connect to the new resource. Each resource type in the system has a different set of required properties.

5.5. Creating Child Resources

JBoss ON can create certain types of children resources for parent resources. For example, a Postgres server can allow JBoss ON to create Postgres users. Not every allowed child resource type for a resource can be created through the JBoss ON UI; these children are usually limited to resource types that can be configured simply and remotely, such as scripts, WAR/EAR files, and server users.
Note
The parent resource must be available to add a child resource.
Note
It can take several minutes for the new child resource to be added and visible in the JBoss ON inventory because the new resource has to be created on the local system and then discovered by the agent. If the discovery scan is running when the resource is created, then it may take until the next discovery scan to be detected.
  1. Click the Inventory tab in the top menu.
  2. Search for the parent resource of the new resource.
    Chapter 2, Dynamic Searches for Resources and Groups has information on searching for resources using dynamic searches.
  3. Click the Inventory tab of the parent resource.
  4. Click the Create Child button in the bottom of the Inventory tab, and select the type of child resource. The selection menu lists the possible types of child resources for that parent.
  5. Give the name and description for the new resource.
  6. Fill in the properties to identify and connect to the new resource. Each resource type in the system has a different set of required properties.

5.6. Viewing and Editing Resource Information

Every resource has details about the server or service that can be viewed, such as its name, description, and version. (The specific information is different for each resource type.) These details are usually hidden when viewing the resource, but they can be viewed by clicking the arrow by the resource name to expand the details area.

Figure 5.2. Expanding Resource Entry Details

Expanding Resource Entry Details
Any fields with green text can be edited. This allows administrators to use more specific or useful information in areas that are supplied by the agent discovery, like the resource name, or to add information, like a description or, for example, a platform's physical location.
To edit a field, hover the cursor over the name and click the pencil icon that appears.
When the edits are made, click the green check mark to save the changes.

5.7. Managing Connection Settings

Connection settings define how an agent can connect to a resource. These are settings defined in the agent plug-in, so they are different for each resource type.
At a minimum, connection settings provide a way for the agent to connect to a resource, such as a port number, directory path, or user credentials.
Note
Often, if a resource shows down availability even when it is running, it is a problem with the connection settings. The agent may not have information it requires, such as a username or new port number, that it requires to connect to the resource. Since the agent cannot connect to the resource, it assumes it is down.
Connection settings can also provide configuration for other controls defined in the plug-in descriptor, like paths or options to use with scripts for operations, log file locations for event monitoring, and configuration files to allow for resource configuration editing.
To edit the connection settings:
  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
    Chapter 2, Dynamic Searches for Resources and Groups has information on searching for resources using dynamic searches.
  3. Click the name of the resource to go to its entry page.
  4. Open the Inventory tab for the resource, and click the Connection Settings subtab.
  5. Change the connection information for the resource.
    If a field is not editable immediately, select the Unset checkbox, and then enter new information in the field.
  6. Click the Save button.

5.8. Uninventorying and Deleting Resources

A resource can be removed from the JBoss ON inventory in one of two ways: it can be uninventoried (all resources) or it can be deleted (content-backed resources or configuration-related resources).

5.8.1. A Comparison of Uninventorying and Deleting Resources

Uninventorying a resource permanently and irrevocably removes all data about that resource from the JBoss ON inventory. It removes all historical monitoring data, configuration and operation histories, alerts, drift definitions, and any other stored data. However, the resource itself remains intact — it still exists on the machine and can be rediscovered (as a new resource) at a later time.
Any resource can be uninventoried.
Deleting a resource, on the other hand, completely removes the resource from the machine itself. So, deleting an EAR resource deletes the EAR from its parent EAP instance. However, the inventory information about that resource — its historic metric data, configuration history, and version history (for content resources) — remains intact, so that if a new version of that child is ever deployed, it is retains all of its original history.
Only resources not discovered through the discovery queue (such as deployed EAR files or created datasources) can be deleted.

5.8.2. Use Caution When Removing Resources

Uninventory Irrevocably Deletes the Resource History and Data

Uninventorying a resource removes all of the data that JBoss ON has for that resource: its metric data and historical monitoring data, alerts, drift and configuration history, operation history, and other data. Once the resource is uninventoried, its data can never be recovered.

Uninventorying or Deleting a Resource Removes All of Its Children

If a parent resource is removed from JBoss ON, then all of its children are also removed. Removing an EAP server, for example, removes all of its deployed web applications from the JBoss ON inventory. Removing a platform removes all servers, services, and resources on that platform.

Uninventoried Resources Can Still Be Discovered

Even though a resource is uninventoried and all of its data in JBoss ON is permanently removed, the underlying resource still exists. This means that the resource can still be discovered. To prevent the resource from being discovered and re-added to the inventory, ignore the resource, as in Section 5.2.4, “Ignoring Discovered Resources”.

Anything Depending on a Deleted Resource Could Fail

Some resource types can be deleted, meaning the resource itself is removed from the machine, not just from the JBoss ON inventory. Anything that relies on that resource can experience failures because the resource is deleted. For example, if a datasource for an EAP server is deleted, that datasource is removed from the EAP server itself. Any application which attempts to connect to that datasource will then stop working, since it does not exist anymore.

5.8.3. Uninventorying through the Inventory Tab

  1. Click the Inventory tab in the top menu.
  2. Select the resource category in the Resources table on the left, and, if necessary, filter for the resource.
  3. Select the resource to uninventory from the list, and click the Uninventory button.
  4. When prompted, confirm that the resource should be uninventoried.
  5. To prevent the resource from being re-imported into the inventory, ignore it when it is discovered in the next discovery scan. This is covered in Section 5.2.4, “Ignoring Discovered Resources”.

5.8.4. Uninventorying through the Parent Inventory

  1. Click the Inventory tab in the top menu.
  2. Search for the parent resource of the resource.
    Chapter 2, Dynamic Searches for Resources and Groups has information on searching for resources using dynamic searches.
  3. Click the Inventory tab for the parent resource.
  4. Click on the line of the child resource to uninventory. To select multiple entries, use the Ctrl key.
  5. Click the Uninventory button.
  6. When prompted, confirm that the resource should be uninventoried.
  7. To prevent the resource from being re-imported into the inventory, ignore it when it is discovered in the next discovery scan. This is covered in Section 5.2.4, “Ignoring Discovered Resources”.

5.8.5. Uninventorying through a Group Inventory

If a resource is a member of a group, then the resource can be uninventoried through the group management pages, as part of managing the group resources.
  1. In the Inventory tab in the top menu, select the compatible or mixed groups item in the Groups menu on the left.
  2. Click the name of the group.
  3. Open the Inventory tab for the group, and open the Members submenu.
  4. Click on the line of the group member to uninventory. To select multiple entries, use the Ctrl key.
  5. Click the Uninventory button.
  6. When prompted, confirm that the resource should be uninventoried.
  7. To prevent the resource from being re-imported into the inventory, ignore it when it is discovered in the next discovery scan. This is covered in Section 5.2.4, “Ignoring Discovered Resources”.

5.8.6. Deleting a Resource

Deleting a resource does several things:
  • Deletes the resource from the underlying machine.
  • Removes the resource from the inventory.
  • Removes any child resources from JBoss ON.
  • Preserves the inventory information in JBoss ON for the resource, including alerts, drift definitions, metric data, and configuration and operation histories.
Only a resource not imported through the discovery queue can be deleted. This generally means that content-backed resources (EARs, WARs, and JARs) and other child resources like datasources can be deleted.
Warning
Because the real, underlying resource is deleted (not just the inventory entry), anything relying on that resource can experience failures.
  1. Click the Inventory tab in the top menu.
  2. Search for the parent resource of the resource to delete.
    Chapter 2, Dynamic Searches for Resources and Groups has information on searching for resources using dynamic searches.
  3. Click the Inventory tab of the parent resource.
  4. Select the resource to delete from the list of children.
  5. Click the Delete button in the bottom of the Inventory tab.

5.9. Viewing Inventory Summary Reports

One quick management tool in JBoss ON is an inventory report. The report summarizes the resources currently in the inventory, grouped by resource type and five summaries:
  • Resource type
  • The JBoss ON server plug-in which manages the resource
  • The JBoss ON category for the resource (platform, server, or service)
  • The version number or numbers for resource of the resource type in inventory
  • The total number of resources of that type in the inventory
To generate the inventory report:
  1. In the top menu, click the Reports tab.
  2. In the Inventory menu box in the menu table on the left, select the Inventory Summary report.
  3. Click the name of any resource type to go to the inventory list for that resource type.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
To export a report, simply click the Export button. The report will automatically be downloaded as inventorySummary.csv.


[1] Provisioning Ant bundles is implemented through an agent plug-in which performs the tasks on the platform and a server-side plug-in which manages the bundles in the server.

Chapter 6. Managing Groups

Groups are a simple, yet effective, way to organize resources. Particularly where there are large numbers of resources or where there are logical divisions between resources across departments, IT environments, or physical locations.
Groups in JBoss Operations Network provide a way to manage resources easily and more consistently. Alerts, operations, and configuration can be applied to individual resources or to entire groups of resources, while groups can be monitored from a single view.

6.1. About Groups

Groups are simply a means to organize resources within the JBoss ON inventory. JBoss ON has several different kinds of groups, listed in Table 6.1, “Types of Groups”, which allows an administrator to manage resources in different, flexible ways.

Table 6.1. Types of Groups

Type Description Static or Dynamic
Mixed groups Contains resources of any resource type. There is no limit to how many or what types of resources can be placed into a mixed group. Mixed groups are useful for granting access permissions to users for a set of grouped resources. Static
Compatible groups Contains only resources of the same type. Compatible groups make it possible to perform an operation against every member of the group at the same time, removing the need to individually upgrade multiple resources of the same type, or perform other operations one at a time on resources across the entire enterprise. Static
Recursive groups Includes all the descendant, or child, resources of resources within the group. Recursive groups show both the explicit member availability and the child resource availability. Static (members) and dynamic (children)
Autogroups Shows every resource as part of a resource hierarchy with the platform at the top, and child and descendant resources below the platform. Child resources of the same type are automatically grouped into an autogroup. Dynamic

6.1.1. Dynamic and Static Groups

Groups are a way of organizing resources. The different types of groups are covered in , but all of these groups fall into one of two categories. Groups are either static or dynamic, depending on how resources are assigned to the group. Static groups have resources which are explicitly assigned to the group, so the membership does not change even if the inventory changes. Dynamic groups are based on some kind of search criteria, and the group members are all of the resources returned in that search. Whenever the inventory is updated, the search results change, and the group membership is automatically updated.
Both static and dynamic groups can be valuable for managing resources and keeping a perspective on the overall IT environment.

6.1.2. About Autogroups

There are two basic types of groups in JBoss ON: static groups, where resources are added manually, and dynamic groups, where resources are added automatically based on some kind of established criteria.
Administrators can configure dynamic groups based on defined searches, which is covered in Chapter 7, Using Dynamic Groups. JBoss ON supports a different kind of dynamic group called an autogroup. Autogroups are used to construct the inventory navigation trees in the JBoss ON UI, and they are based on the underlying resource hierarchy, or parent-child relationships. Autogroups also group along resource type. For example, in Figure 6.1, “PostgreSQL Autogroup”, there are autogroups under the Postgres resource for all its children, which are further divided based on the child resource type, databases and users.

Figure 6.1. PostgreSQL Autogroup

PostgreSQL Autogroup
Autogroups, unlike other groups in JBoss ON, are not configurable by JBoss ON users. Autogroups are defined internally in the JBoss ON server and are used by JBoss ON.

6.1.3. Comparing Compatible and Mixed Groups

Using groups allows multiple resources to be managed simultaneously. The type of group — compatible or mixed — specifies what kind of management can be performed on the group members.
Compatible groups, because they have members all of the same type, can be managed almost as easily as a single resource. Administrators can change resource configuration, launch operations, set alerts, and view individual and group-averaged monitoring data. Any changes can be made to a single group member, selected members, or the entire group. The list of group members, the group inventory, is managed through the Inventory tab.

Figure 6.2. Compatible Group Entry

Compatible Group Entry
Mixed groups can have members of different resource types, so group management is limited to updating the members (the group inventory) and viewing the history of alerts and events for the group members.

Figure 6.3. Mixed Group Entry

Mixed Group Entry

6.1.4. Leveraging Recursive Groups

Compatible and mixed groups have a set, explicit membership. This static structure makes them useful for creating policies within JBoss ON because they are reliable.
Compatible and mixed groups can have a setting on them making them recursive. A recursive group travels down the inventory of every member and implicitly adds all of their children to the group, too. In a sense, a recursive group has two tiers of members: the explicit members to the group and then the implicit set of child members.
This idea of two levels of members allows a group to retain its own type definition — meaning specifically that it does not change a compatible group to a mixed group just because its children are added as members. The explicit membership is called the root node. Group operations, from adding members to setting metrics schedules to changing connection settings, are only performed on that root node.
Recursive groups can be useful in a lot of different ways.
For example, recursive compatible groups can be used to look at a subset of autoclusters in the inventory because all of the child resources are grouped by type, as are their parent resources. This provides an easy view at a subset of resources. (Mixed groups, which do not group by type, show only the root node or explicit members in the hierarchy.)
One of the more important uses is using recursive groups, particularly mixed recursive groups, for authorization control. This is covered more in Chapter 9, Managing Roles and Access Control, but users have to be granted access to resources explicitly, by adding the user and the resource to the same role. Using a recursive group automatically includes all of those resources' children in the role, which makes the role easier to maintain and more accurate in granting access.

6.2. Creating Groups

A user must have the global security or inventory permission to create groups.
  1. Click the Inventory tab in the top menu.
  2. In the Groups box in the left menu, select the type of group to create, either compatible or mixed.
    Compatible groups have resources all of the same type, while mixed groups have members of different types. The differences in the types of members means that there are different ways that compatible and mixed groups can be managed, as covered in Section 6.1.3, “Comparing Compatible and Mixed Groups”.
  3. Enter a name and description for the group.
    Marking groups recursive can make it easier to manage resources, particularly when setting role access controls. For example, administrators can grant users access to the group and automatically include any child resources of the member resources.
  4. Select the group members. It is possible to filter the choices based on name, type, and category.

6.3. Changing Group Membership

Compatible and mixed groups both have static members, which means that resources are manually assigned to the group rather than being assigned dynamically based on some attribute. The group membership can be changed as the resources in the JBoss ON inventory change.
  1. In the Inventory tab in the top menu, select the compatible or mixed groups item in the Groups menu on the left.
  2. Click the name of the group.
  3. Open the Inventory tab for the group, and open the Members submenu.
  4. Click the Update Membership button at the bottom of the page.
  5. Select the resources to add to the group from the box on the left; to remove members, select them from the box on the right. Use the arrows to move the selected resources. To select multiple resources, use Ctrl+click.
  6. Click the Save button.

6.4. Editing Compatible Group Connection Properties

Compatible groups manage the connection properties of the group members as part of the inventory. Since a compatible group can only contain members of the same resource type, it is possible to see an aggregate view or average of all of their individual connection properties. The connection settings define how the agent or server connects to the resource.
The rules are for the values of compatible group connections are simple:
  • If all of the resources in the group have identical values for a property, the group connection property is that exact value.
  • If even one resource has a different value than the rest of the resources in the group, that property will have a special marker value of ~ Mixed Values ~.
To edit the connection properties:
  1. In the Inventory tab in the top menu, select the Compatible Groups item in the Groups menu on the left.
  2. Click the name of the compatible group.
  3. Open the Inventory tab for the group, and click the Connection Settings sub-item.
  4. To edit a property, click the pencil by the field.
  5. To change all resources to the same value, click the Unset checkbox for the field Set all values to.... To change a specific resource, click the Unset checkbox for that resource and then give the new value.
Note
Refreshing the inventory tab shows the current values of the connection properties for each resource in the group. If the update has not yet completed, but it has successfully changed some of the resource's connection properties, intermediate values are displayed. Just ignore these values. Once the updates have completed, refresh the page to view final results.
The Connection Settings History sub-item shows the changes made to the connection properties. If there is a failure, clicking the hyperlink in the Date Created column opens any relevant error messages.

Chapter 7. Using Dynamic Groups

A dynamic groups specifies a search term to use to search the inventory and identify matching resources to belong to the group. Since the search results change automatically as results are added and removed from the inventory, the group membership is always changing and always current. Using dynamic groups helps automate management tasks for large inventories.
Note
Dynamic groups are also referred to by the nickname dynagroups.
Enterprise resources can be grouped by cluster identifier, broadcast group, logical service layer, geographical location, security domain, or any other logical grouping.
Individual resources can potentially belong to multiple groups, in large inventories it is important to know how different group definitions will affect the enterprises resources.

7.1. About Dynamic Groups Syntax

Dynamic groups are configured through group definitions. A group definition uses expressions which define searches for resources, along with other information about the group like the recalculation interval.
Dynamic groups have an expression syntax very similar to the one used for dynamic searches (Section 2.2, “About the Dynamic Search Syntax”).

7.1.1. General Expression Syntax

The expression is a search condition that is centered around a specific resource attribute, either by a specific value of an attribute or simply by the presence of an attribute.
An expression defines a way to group resources:
  • By a specific resource attribute or value (a simple expression)
  • By the resource type (a pivoted expression)
  • By membership in another group (a narrowing expression)
A single group definition can have multiple expressions. The order of expressions in the group definition does not matter; for example, both of these expressions are interpreted exactly the same when calculating the group members:
expression 1
exprA1 
exprA2 
groupby exprB1 
groupby exprB2

expression 2
exprA2 
exprA1 
groupby exprB2 
groupby exprB1
Note
When multiple expressions are used in a group definition, they are treated as logical AND expressions, and a resource must match all the criteria to belong to the group.
Any empty lines between expressions in a dynagroup definition are ignored.
The possible resource properties cover resource information like the resource name, type, plug-in, version, configuration property, and inventory ID number.

Table 7.1. Dynamic Group Properties

Type Supported Attributes
Related to the resource itself
resource
id
name
version
parent
grandparent
children
Related to the resource type
resourceType
plug-in
name
category (platform, server, service)
Related to the resource configuration
plug-inConfiguration
Any plugin configuration property
resourceConfiguration
Any resource configuration property
Related to the resource monitoring data
traits
Any monitoring trait
availability
The current state, either UP or DOWN
If an expression has the structure resource.attribute, then it applies to the resource which will be a member of the group. However, it is possible to use an attribute in an ancestor or child entry to identify a group member recursively.
For example, to add a resource as member which as an inventory ID of 10001, the expression is:
resource.id = 10001
To add all of the children of a resource with an ID of 10001 as group members, use the prefix resource.parent:
resource.parent.id = 10001
There are four possible prefixes for all of the resource attributes in Table 7.1, “Dynamic Group Properties”:
  • resource
  • resource.child
  • resource.parent
  • resource.grandParent
If a definition is so restrictive that no resources match the filters, no group is created. JBoss ON will actually suppress a group from being created by a group definition if it would result in an empty group. Because there are no empty groups, there are no extraneous groups listed in the inventory, which makes managing inventories easier and better reflects your real infrastructure.

7.1.2. Simple Expressions: Looking for a Value

A simple expression uses an attribute-value pair or triad in this format:
resource.attribute[string-expression] = value
For example:
resource.parent.type.category = Platform
Not every resource attribute has an additional string-expression; a string-expression is basically a sub-attribute. For example, resource.trait is the generic resource attribute, and a sub-attribute like partitionName identifies the actual parameter.
Simple expressions usually search for resources based on an explicit value, but resources may have attributes present with null values, and those null values would be not returned with a simple expression. The empty keyword searches for resources which have a specific attribute with a null value:
empty resource.attribute[string-expression]
If the empty keyword is used, then there is no value given with the expression.
Simple expressions can also use a not empty keyword, which looks for every resource with that attribute, regardless of the attribute value, as long as it is not null. As with the empty keyword, there is no reason to give a value with the expression, since every value matches the expression.
not empty resource.attribute[string-expression]

7.1.3. Pivot Expressions: Grouping by an Attribute

Simple expressions create a single group, because they are based on the specific results of the search. Alternatively, a pivot expression creates multiple groups because it identifies resources which belong to the group solely based on whether an attribute exists; it creates subgroups based on the values. A pivot expression uses the groupby keyword:
groupby resource.attribute
Pivoted expressions create groups based on unique occurrences of an attribute value. For example, the parent.name attribute creates a unique group based on every parent resource.
groupby resource.parent.name
For the resources in Figure 7.1, “Resources and Parents”, the pivot expression creates groups for the three parents within the resource hierarchy: ResourceParentA, ResourceParentB, and ChildA2.

Figure 7.1. Resources and Parents

Resources and Parents
If the overall group definition includes resources with null values, then the pivot expression creates a special subgroup that contains those resources.

7.1.4. Narrowing Expressions: Members of a Group

Expressions are generally evaluated across the entire inventory. For example, setting a pivot expression for the JBoss AS 7 Server resource type checks the inventory for every EAP 6 or AS 7 instance.
In some occasions, particularly if trying to create fine-grained groups for access control or bundle management, it is simpler to define an expression to run against the members of an existing group, rather than trying to devise a complex expression to get a subset of the same type of resource.
This is done by specifying the memberof keyword. This specifies a group name (compatible group, mixed group, autogroups, recursive group, even another dynagroup), and only members of that group are evaluated for other matching expressions.
Note
The memberof keyword specifies a group name. If the group is a recursive group, then all of the recursive members are included as part of the group for evaluation.
For example, an administrator creates different groups for application development related resources for development, QE, and production teams. These are mixed groups of platforms, Postgre databases, EAP 6 servers, and web contexts. When new resources are deployed, a CLI script is run to import the resources and automatically update the resource groups. Access control rules and roles need to use both the main groups — such as Dev Stack Resource Group, a mixed group — and subgroups within that group, based on resource types. Rather than creating and updating multiple groups and roles every time resources are deployed or removed, the administrator creates a dynagroup expression which first narrows the scope of the expression to the given team resource group, and then pivots by resource type. The dynagroup definition only needs to be set once, and it is dynamically updated with inventory changes every time the group is recalculated.
memberof = "Dev Resource Group"
groupby resource.type.name
Note
The group membership, as with other expressions, is only updated with memberof changes when the dynagroup is recalculated.
Multiple memberof expressions are allowed in the group definition, with each memberof expression referencing a single group. If multiple memberof expressions are used, they are treated as AND expressions; any matching resource must be a member of all specified groups.

7.1.5. Compound Expressions

Multiple expressions can be used in a single dynagroup definition; these are compound expressions.
When multiple expressions are used in a group definition, they are treated as logical AND expressions, and a resource must match all the criteria to belong to the group. (Any empty lines between expressions in a dynagroup definition are ignored.)
For example, this basic expression searches for every resource which has the platform as its parent:
resource.parent.type.category = Platform
That could return a very long list of servers and services. That initial list can be further filtered by adding another simple expression, which filters by name:
resource.parent.type.category = Platform
resource.name.contains = JBossAS
Only resources with the platform as the parent and with the string JBossAS in their name will be added to the group.
Pivoted expressions can also be used in compound expressions. Every line must have the groupby keyword, not only the first line.
groupby resource.type.plugin
groupby resource.type.name 
groupby resource.parent.name
A compound expression can contain both simple and pivoted expressions. This creates a compatible group for every unique server type on the platform.
resource.type.category = server 
groupby resource.type.plugin 
groupby resource.type.name 
groupby resource.parent.name
Lastly, compound expressions can include the empty and not empty keywords. For example, simple expressions can be used to identify JBoss servers based on the resource type and name. Then, to identify which JBoss servers are unsecured, the expression can filter for JBoss servers with a principal connection property with an empty value.
resource.type.plugin = JBossAS
resource.type.name = JBossAS Server
empty resource.pluginConfiguration[principal]

7.1.6. Unsupported Expressions

There are restrictions on how multiple expressions can be used together.

All expressions must be in the same configuration area.

All given configuration properties in an expression must be only from the resource configuration or only from the plug-in configuration. Expressions cannot be taken from both.

Each property must only be used once.

A property can only be used once in a dynagroup definition.

valid
resource.trait[x] = foo

not valid
resource.trait[x] = foo
resource.trait[y] = bar
For example, a resource.trait expression can only occur once in a definition:
resource.grandParent.trait[Trait.hostname].contains = stage
resource.parent.type.plugin = JBossAS5
resource.type.name = Web Application (WAR)
If it is used a second (or more) time, then the subsequent use fails and causes the definition not to be parsed.
resource.grandParent.trait[Trait.hostname].contains = stage
resource.parent.type.plugin = JBossAS5
resource.type.name = Web Application (WAR)
resource.trait[contextRoot] = jmx-console
This results in calculation errors:
There was a problem calculating the results: java.lang.IllegalArgumentException: org.hibernate.QueryParameterException: could not locate named parameter [arg2]
The [arg2] error is a sign that multiple expressions of the same type were used and that the second and subsequent expressions caused a calculation failure.
This is true even if the property type is used with different resource contexts.
resource.parent.trait[x] = foo
resource.grandParent.trait[y] = bar
In the previous example, one trait applied to the grandparent resource and one trait applied to the resource itself. This still failed because the trait property was used twice, even though it was for different resources.

7.1.7. Dynagroup Expression Examples

A single group definition can have multiple expressions, even mixing simple and pivoted expressions in a single definition. Many of these examples require multiple expressions to complete the definition.

Example 7.1. JBoss Clusters

resource.type.plugin = JBossAS 
resource.type.name = JBossAS Server 
groupby resource.trait[partitionName]

Example 7.2. A Group for Each Platform Type

resource.type.plugin = Platforms 
resource.type.category = PLATFORM 
groupby resource.type.name

Example 7.3. Autogroups

groupby resource.type.plugin 
groupby resource.type.name 
groupby resource.parent.name
Note
This could create a large number of groups in large inventories.

Example 7.4. Raw Measurement Tables

resource.type.plugin= Postgres 
resource.type.name = Table 
resource.parent.name = rhq Database 
resource.name.contains = rhq_meas_data_num_

Example 7.5. Only Agents with Multicast Detection

resource.type.plugin= RHQAgent 
resource.type.name = RHQ Agent 
resource.resourceConfiguration[rhq.communications.multicast-detector.enabled] =  true

Example 7.6. Only Windows Platforms with Event Tracking

resource.type.plugin= Platforms 
resource.type.name = Windows 
resource.pluginConfiguration[eventTrackingEnabled] =  true

Example 7.7. JBoss AS Servers by Machine

groupby resource.parent.trait[Trait.hostname] 
resource.type.plugin = JBossAS 
resource.type.name = JBossAS Server

7.2. Creating Dynamic Groups

  1. Click the Inventory tab in the top menu.
  2. In the Groups menu box on the left, click the Dynagroup Definitions link.
  3. Click the New button to open the dynamic group definition form.
  4. Fill in the name and description for the dynamic group. The name can be important because it is prepended to any groups created by the definition, as a way of identifying the logic used to create the group.
  5. Fill in the search expressions. This can be done by entering expressions directly in the Expression box or by using a saved expression.
    Saved expressions are have a wizard to help build and validate the expressions. To create a saved expression, click the button by the drop-down menu. Several options for the expression are active or inactive depending on the other selections; this prevents invalid expressions.
    The Expression box at the top shows the currently created expression.
  6. After entering the expressions, set whether the dynamic group is recursive.
  7. Set an optional recalculation interval. By default, dynamic groups do not recalculate their members automatically, meaning the recalculation value is set to 0. To recalculate the group membership, set the Recalculation interval to the time frequency, in milliseconds.
    Note
    Recalculating a group definition across large inventories could be resource-intensive for the JBoss ON server, so be careful when setting the recalculation interval. For large inventories, set a longer interval, such as an hour, to avoid affecting the JBoss ON server performance.

7.3. Recalculating Group Members

Dynamic groups can be recalculated apart from whatever interval is set in the group definition. The recalculation interval in the group definition is a relative value based on the last update time, so initiating a recalculation manually will not conflict or interfere with the normal recalculation; it simply proceeds after the specified amount of times elapses.
  1. Click the Inventory tab in the top menu.
  2. In the Groups menu on the left, click the Dynagroup Definitions link.
  3. In the list of dynagroups, select the row of the dynagroup definition to calculate.
  4. Click the Recalculate button at the bottom of the table.

Chapter 8. Creating User Accounts

Users are part of the overall security planning for JBoss ON, even though they don't have access controls set on their accounts individually.

8.1. Managing the rhqadmin Account

When JBoss ON is installed, there is a default superuser already created, rhqadmin. This superuser has the default password rhqadmin.
Note
The rhqadmin account cannot be deleted, even if other superuser accounts are created. Additionally, the role assignments for rhqadmin cannot be changed; it is always a superuser account.
Important
If a user is deleted, scheduled operations owned by the user are canceled.
When you first log into JBoss ON after installation, change the superuser password.
  1. Click the Administration tab in the top menu.
  2. In the Security table on the left, select Users.
  3. Click the name of rhqadmin.
  4. In the edit user form, change the password to a new, complex value.

8.2. Creating a New User

  1. Click the Administration tab in the top menu.
  2. In the Security table on the left, select Users.
  3. Click the NEW button at the bottom of the list of current users.
  4. Fill in description of the new user. The Enable Login value must be set to Yes for the new user account to be active.
  5. Select the required role from the Available Roles area, and then click the arrow pointing to the Assigned Roles to assign the role.
  6. Click the Save button to save the new user with the role assigned.

8.3. Editing User Entries

All users can edit their own account details, and users with administrative rights (who belong to a role which grants them rights over user entries) can edit other users' entries.
  1. Click the Administration tab in the top menu.
  2. From the Security menu, select Users.
  3. Click the name of the user whose entry will be edited.
  4. In the edit user form, change whatever details need to be changed, and save.

8.4. Disabling User Accounts

User accounts can be temporarily disabled. This can be done for a security review or when there is some kind of breach, but users don't need to be deleted. The Enable Login property can prevent the user from logging into the JBoss ON UI and managing resources or making configuration changes.
  1. Click the Administration tab in the top menu.
  2. In the Security table on the left, select Users.
  3. Click the name of the user whose entry will be edited.
  4. In the edit user form, change the Enable Login radio button to No.
  5. Click the Save button to save the new user with the role assigned.
The user account can be re-enabled at any time by changing the Enable Login value back to Yes.

8.5. Changing Role Assignments for Users

  1. Click the Administration tab in the top menu.
  2. From the Security menu, select Users.
  3. Click the name of the user to edit.
  4. To add a role to a user, select the required role from the Available Roles area, click the arrow pointing to the Assigned Roles area. To remove a role, select the assigned role on the right and click the arrow pointing to the left.
  5. Click Save to save the role assignments.

Chapter 9. Managing Roles and Access Control

In JBoss Operations Network, security is implemented through rules that are set on users and roles. Restrictions are set on what users and roles can access and what operations they can perform.

9.1. Security in JBoss ON

Security establishes precise relationships between users, resources, and the tasks users can perform. Interactions between users and resources are ordered by including or excluding those users and resources (through groups) in defined roles, and then granting the role the ability to perform tasks.

9.1.1. Access Control and Permissions

When a user is allowed to perform a certain operation, that is called a permission. All permissions must be explicitly granted to explicit resources. If a user is not given permission to a specific resource group, then the user, by default, has no access to that group — even if the user has permission to perform a task. Likewise, if a user has access to a group but has no permissions assigned, then the user cannot perform any tasks.
Any permissions set in JBoss ON are given to a role, and the members of the role inherit those permissions. Allowing or restricting permissions is access controls.
In JBoss ON, there are two levels where access control is granted:
  • Global permissions apply to JBoss ON server configuration. This covers administrative tasks, like creating users, editing roles, creating groups, importing resources into the inventory, or changing JBoss ON server properties.
  • Resource-level permissions apply to actions that a user can perform on specific resources in the JBoss ON inventory. These cover actions like creating alerts, configuring monitoring, and changing resource configuration. Resource-level permissions are tied to the subsystem areas within JBoss ON.
All JBoss ON permissions are listed in Table 9.1, “JBoss ON Access Control Definitions”.
For resource-level rights, read and write permissions are granted independently. A user can be granted a right to view (read) resource data without automatically being granted the right to edit that configuration. For example, any user can view the operations history of a resource or view the configured alerts for a resource within the role even if that role has not been given edit access to those subsystem areas.
By default, read access is enabled by default for all resource-level rights, with one exception: resource configuration. Resource configuration can be considered a security risk, so even read access is denied by default. Of course, read access, like write access, can be enabled or disabled for any resource-level permission.

Figure 9.1. Read Access Option

Read Access Option
Note
Granting a role the right to change something does not implicitly grant the right to delete something. For example, users with the configuration write permission can edit resource configuration and view configuration history and settings, but they cannot delete elements in the configuration history. Similar constraints are true for users with permission to create and edit operations and alerts — there is no right to delete elements in the resource history.
Deleting elements in the history requires the manage inventory permission.

Table 9.1. JBoss ON Access Control Definitions

Access Control Type Description
Global Permissions
Manage Security
Equivalent to a superuser. Security permissions grant the user the rights to create and edit any entries in JBoss ON, including other users, roles, and resources, to change JBoss ON server settings, and to control inventory.
Warning
The Security access control level is extremely powerful, so be cautious about which users are assigned it. Limit the number of superusers to as few as necessary.
Manage Inventory Allows any operation to be performed on any JBoss ON resource, including importing new resources.
Manage Settings Allows a user to add or modify any settings in the JBoss ON server configuration itself. This includes operations like deploying plug-ins or using LDAP authentication.
Manage Bundle Groups
Allows a user to add and remove members of a bundle group; implicitly, it includes the permission to view bundles. This is analogous to the Manage Inventory permission for resources.
Note
This permission is required for all bundle-level create, deploy, and delete permissions.
Deploy Bundles to Groups Allows a user to deploy a bundle to any resource group to which the user has access.
View Bundles Allows a user to view all bundles, regardless of the bundle group assignment.
Create Bundles Allows a user to create and update bundle versions. When a bundle is created, it must be assigned to bundle group, unless the user has the View Bundles permission; in that case, a user can create a bundle and leave it unassigned.
Delete Bundles Allows a user to delete any bundle which he has permission to view.
Manage Bundles (Deprecated)
Allows a user to upload and manage bundles (packages) used for provisioning resources.
This permission has been deprecated. It is included for backward-compatibility with older bundle configuration and user roles. However, this permission offered no ability to limit access to certain bundles, groups, or resources (for deployment); without this fine-grained control, this permission could only be applied to high-level administrators to maintain security.
Manage Repositories Allows a user to access any configured repository, including private repositories and repositories without specified owners. Users with this right can also associated content sources with repositories.
View Users Allows a user to view the account details (excluding role assignments) for other users in JBoss ON.
Resource-Level Permissions
Inventory Allows a user to edit resource details and connection settings — meaning the information about the resource in the JBoss ON inventory. This does not grant rights to edit the resource configuration.
Manage Measurements Allows the user to configure monitoring settings for the resource.
Manage Alerts Allows the user to create alerts and notifications on a resource. Configuring new alert senders changes the server settings and is therefore a function of the global Settings permissions.
Control Allows a user to run operations (which are also called control actions) on a resource.
Configure
Allows users to change the configuration settings on the resource through JBoss ON.
Note
The user still must have adequate permissions on the resource to allow the configuration changes to be made.
This access area has two options:
  • Read, which grants read-only access to the resource configuration
  • Write, which grants modify access and, implicitly, read access
If one of these permissions is not granted to a role, then the users in the role are denied any access to the resource configuration.
Manage Drift Allows the user to create, modify, and delete resource and template drift definitions. It also allows the user to manage drift information, such as viewing and comparing snapshots.
Manage Content Allows the user to manage content providers and repositories that are available to resources.
Create Child Resources Allows the user to manually create a child resource for the specified resource type.
Delete Child Resources Allows the user to delete or uninventory a child resource for the specified resource type.
Bundle-Level Permissions
Assign Bundles to Group Allows a user to add bundles to a group. For explicit bundle groups, this is the only permission required. To add bundles to the unassigned group (which essentially removes it from all group membership), this also requires the global View Bundles permission.
Unassign Bundles from Group Allows a user to remove bundles from a group.
View Bundles in Group Allows a user to view any bundle within a group to which the user has permissions.
Create Bundles in Group Allows a user to create a new bundle within a bundle group to which he has permission. This also allows a user to update the version of an existing bundle within the bundle group.
Delete Bundles from Group Allows a user to delete both bundle versions and entire bundles from the server, so long as they belong to a group to which the user has permissions.
Deploy Bundles to Group Allows a user to deploy any bundle which he can view (regardless of create and delete permissions) to any resource within a resource group to which he has permissions.

9.1.2. Access and Roles

JBoss ON handles access to both resources and JBoss ON configuration through roles. A role has certain permissions assigned to it, meaning things that members of the role are allowed to do.
Only two types of JBoss ON identities can belong to a role: users and groups.
Users are assigned to a role to be granted those permissions. Users can be added to a role individually or be added as a member of an LDAP group.
Resource groups are assigned to a role to provide a list of resources that those users can perform actions on. Another way of looking at it is that users can only manage resources that they are expressly given access to. Roles define that access.
Note
Be sure to create resource groups to assign to any custom roles you create. If no resource groups are assigned to a role, then none of the members of the role can see any resources. Creating groups is described in Section 6.2, “Creating Groups”.
By default, users created without roles are able to login. This setting can be changed by clicking on "Administration" in the top menu, going to the "System Settings" section in the "Configuration" table, and updating the "Enable Login Without Roles" setting.
One convenient feature of roles is that users can be automatically assigned to roles by assigning an LDAP group to the role (Section 10.3.2, “Associating LDAP User Groups to Roles”). All of the LDAP users who belong to that group are automatically members of the role. (This is similar to the simplicity of using LDAP user to create JBoss ON users by enabling LDAP authentication, in Section 10.2.3, “Configuring LDAP User Authentication”.)
There are two roles already configured in JBoss ON by default:
  • A superuser role provides complete access to everything in JBoss ON. This role cannot be modified or deleted. The user created when the JBoss ON server was first installed is automatically a member of this role.
  • An all resources role exists that provides full permissions to every resource in JBoss ON (but not to JBoss ON administrative functions like creating users). This is a useful role for IT users, for example, who need to be able to change the configuration or set up alerts for resources managed by JBoss ON but who don't require access over JBoss ON server or agent settings.

9.1.3. Access and Groups

A role is essentially the intersection of four elements: users, resource groups, bundle groups, and the permissions allowed to those groups.
Access defines a relationship. The most direct relationship is with the users: a role allows a user to do something to some resource within JBoss ON. However, access in roles also define relationships between groups.
Group relationships function in two ways. First, it can grant different types of permission to a main group and a subgroup, such as granting view access to all resources and restricting configuration access to just some. It can also grant different types of access to two distinct groups.
Group relationships grow more complex when there are different types of groups involved, both resource groups and bundle groups. While related to a very specific type of task — content or application deployment — there are a number of different potential interactions, depending on different access to resources, to bundles, and to operation points within the lifecycle (create, deploy, delete).
For most infrastructures, users will not belong to a single role, and a single role probably cannot define all of the interactions to perform a set of tasks. Rather, roles are like a Venn diagram, which overlap each other to create the function list of access rules.

Two Roles to Define Access for a Single User to Resources and Bundles

Bundle Group A                  Resource Group A
     |                                 |
     V                                 V
  Role 1   <---  User A  --->    Role 2
     ^                                 ^
     |                                 |
  Permissions                     Permissions
   - view bundles in group         - deploy bundles to group
   - create bundles

9.2. Creating a New Role

Note
Be sure to create resource groups to assign to any custom roles you create. If no resource groups are assigned to a role, then none of the members of the role can see any resources. Creating groups is described in Section 6.2, “Creating Groups”.
  1. Create any resources groups which will be associated with the role. Creating groups is described in Section 6.2, “Creating Groups”.
    By default, JBoss ON uses only resource groups to associate with a role, and these are required. However, optional user groups from an LDAP directory can also be assigned to a role, so that the group members are automatically treated as role members. LDAP groups must be configured in the server settings, as described in Section 10.3.2, “Associating LDAP User Groups to Roles”.
  2. In the top menu, click the Administration tab.
  3. In the Security menu table on the left, select the Roles item.
  4. The list of current roles comes up in the main task window. Click the New button at the bottom of the list.
  5. Give the role a descriptive name. This makes it easier to manage permissions across roles.
  6. Set the access rights for the role in the Permissions. There are two categories of permissions:
    • Global permissions grant permissions to areas of the JBoss ON server and configuration.
    • Resource permissions grant permissions for managing resources.
    The specific access permissions are described in Section 9.1.1, “Access Control and Permissions”.
  7. Select the Resource Groups tab to assign groups to the role.
    Move the required groups from the Available Resource Groups area on the left to the Assigned Resource Groups on the right as required.
  8. At the bottom, click the Save button.
  9. Select the Users tab to assign users to the role.
    Move the required user from the Available Users area on the left, to the Assigned Users on the right as required.
  10. Click the arrow in the upper right to close the create window.

9.3. Extended Example: Read-Only Access for Business Users

JBoss ON distinguishes between read permissions and write permissions. Neither read nor write access is implicit to any object or functional area in JBoss ON, which gives administrators some flexibility in where and what access is granted.

The Setup

Example Co. needs some of its management team to be able to read and access JBoss ON data to track infrastructure performance and maintenance, define incident response procedures, and plan equipment upgrades. While these business users need to view JBoss ON information, they should not be able to edit any of the configuration, which is handled by the IT and development departments.

Tim the IT Guy plans to create a special business managers role that will allow users to "see but not touch" the resource configuration.

The Plan

Tim the IT Guy first defines what actions the business users need to perform, and they need to be able to see everything:

  • View resources in the inventory and histories for adding and deleting resources.
  • View monitoring information, including measurements and events.
  • View alerts.
  • View content and bundles and any deployments to resources.
  • View configuration drift.
  • View all resource histories for configuration and operations.
  • View user details to get information for auditing actions.
All of the global permissions relate to creating entries and managing configuration in JBoss ON and the inventory — which none of the business managers need to be able to do. There is one exception: the view users permission, which allows regular users to see the details of other users. That is necessary, because many actions such as running operations and changing resource configuration list what user initiated the action. Being able to view user information is required for adequate auditing of infrastructure changes.
The default selection for roles is for all resource-level permissions to grant read access to users, with the exception of configuration rights, which have no access. Tim the IT Guy decides to grant read-only access to the configuration so that managers can check the configuration history, which could be useful for policy planning. The group has read-only access to all resources and to items like reports.

The Results

Business users are given access to all of the information they need, without being able to change any configuration or inventory accidentally.

9.4. Extended Example: View All Resources, Edit Some Resources

For security and for management practices, access should be limited to what is necessary for a given user. Security starts at restricting access as the default and then explicitly granting defined rights (read, modify, delete) to specific resources.
While it is possible to create a set of access control rules which define everything that is required for a user (or set of users), these rules in practice are cumbersome, complicated, and easily outdated or inaccurate. Access controls are a definition of a relationship; frequently, the different relationships that a user has are too complex to be defined in a single role.
The effect of roles is cumulative. The total level of access for a user is the sum of all of the roles they belong to — access to resources in all of the specified resource groups, the permissions granted to those roles.
Because the effects are cumulative, it is most effective to design access controls in a layered way, using small roles that define a very specific set of access and then adding those roles to users incrementally.
This layered approach allows administrators to define and effectively maintain complex relationships, both at a macro-level along personnel and infrastructure divisions and at a micro-level with different relationships to and between different resources.

The Setup

Example Corp. has three major groups associated with its IT infrastructure: development, QE, and production. Each group requires information from the other teams to help maintain their configuration, manage performance settings, and roll out new applications, but they should only be able to manage their own systems.

Within each group, there are two separate application management tasks: updating and deploying new versions of the application and managing system configuration for optimal performance.

The Plan

Tim the IT Guy first defines the different relationships that need to be expressed within the access controls:

  • Everyone needs to be able to view performance data for all stacks within the infrastructure.
  • The individual divisions need write access to their own systems.
  • At least some administrators within each group require the ability to update system configuration.
  • At least some administrators within each group require the ability to create and deploy bundles to manage applications within their own groups.
The first step is to define the required groups, with the minimum required resources in each group. In a simplified structure:
  • A mixed group which contains all of the resources within each given stack environment. The stacks include platforms, Postgres databases, EAP servers, web contexts, and other resources used to manage the production environment.
    This results in three groups: Dev Stack, QE Stack, and Production Stack.
  • An "all stacks" nested group which includes all three stack groups.
    This group is not for all resources — it only includes the stack groups, excluding JBoss ON-related resources and other managed resources not relevant to those stacks.
  • Since these environments include application development, each organization also requires its own bundle group to maintain deployments.
  • There has to be a mechnism to promote bundles between environments. Tim the IT Guy creates "Promote Bundles" group where bundles can be added when they are ready to be moved into a different environment.
Each of the resource and bundle groups could be broken down, if different users within the same division require different levels of access to individual resources.
Then, different users require different kinds of access. Tim the IT Guy then maps the different permissions that are required:
  • View-only rights to all resources, including configuration view-only rights
  • Edit rights to resources within the stacks for monitoring, alerts, drift, operations, and inventory
  • Edit rights to resources within the stacks for configuration
  • View bundle rights within the stacks
  • Create and deploy bundle rights within the stacks
There are essentially three types of users within each functional group:
  • Regular users
  • Administrators which manage resource configuration
  • Administrators which can create (promote) bundles between groups
The roles are going to corrspond to each resource and bundle group with permissions set for the minimal requirements for that group (local view and edit). For the administrators, the additional permissions — configuration and promoting content — are going to be added by creating additional roles with only the additional permissions.
For a regular user, he adds roles for all resources and bundles and resources within his stacks.
       Dev Stack 
      Bundle Group
           |
       Role BG1
           |
           V
     Regular Joe
      ^         ^
      |         |
   Role RG1  Role RG2                      
      |         |
 "All Stack"   Dev Stack  
  Resource     Resource
  Group        Group
For the "All Stack" role, he adds read-only permission for all inventory areas and configuration.
      ^
      |      
      Role RG1 <------Permissions        
      |                     |
 "All Stack"              View.alerts
  Resource                View.inventory
  Group                   View.measurements
                          View.etc...
                          View.configuration
For the stack group, Tim the IT Guy sets permissions to edit every functional area (measurements, alerts, operations, events) except for configuration, which is reserved for administrator users. The resource group also includes the permission to deploy bundles to the resources in the group. The ability to deploy bundles is separate from the ability to create bundles.
      ^
      |      
      Role RG2 <------Permissions        
      |                     |
  Dev Stack              Edit.alerts
  Resource               Edit.inventory
  Group                  Edit.measurements
                         Edit.etc...
                         Deploy.bundles
Last, he adds permissions to view and create bundles within the regular user's bundle group.
 Dev Stack 
      Bundle Group
           |
       Role BG1 <-----Permissions
           |                |
    V             View.bundles
                         Create.bundles
In this configuration, an administrator has two extra tasks: managing resource configuration within their stacks and promoting bundles to the next work group. The administrator is added to all of the roles of the regular user, plus additional roles for the additional tasks.
Tim the IT Guy creates one additional role to define the configuration permission. This grants configuration editing only to resource groups the administrator can see (the ones in his work groups).
"Regular Joe" roles
        |
        V
   Group Lead <------Role RG3
                            |
                        Permissions
                            |
                     Edit.configuration
Last, each administrator is added to the role for each bundle group. The bundle group roles only grant access to the bundle group to create content, not to view, deploy, or delete it. This allows administrators to promote content between work groups, but not to deploy it or affect resource configuration.
                Dev Stack          Permission:
             Bundle Group          Create.Bundles
                        \          /
                         \        / 
                          Role BG1
                              |
                              V
       Role BG2 ---->    Group Lead    <---- Role BG3 
      /      \                               /     \
     /        \                             /       \
 QE Stack      Permission:        Prod Stack         Permission: 
Bundle Group   Create.Bundles   Bundle Group         Create.Bundles

The Result

Users within each group are granted access to view whatever performance and configuration information they need, but they can only make changes to resources within their specified group. Only administrators within each group can make configuration changes.

Application deployment is limited within each functional area (development, QE, and production). Specific administrators which are allowed to create content in other groups, but not to deploy it.

Chapter 10. Integrating LDAP Services for Authentication and Authorization

JBoss ON can incorporate LDAP directories to help manage users, authentication, and membership in roles. This simplifies user management in JBoss ON and also leverages existing organizational configuration (user accounts, groups, passwords, and account lockout policies) so that JBoss ON mirrors other infrastructure configuration.
Important
If LDAP is used for user account management, then the LDAP directory should be the authoritative source for creating and managing user accounts. Otherwise, there can be inconsistencies in role memberships, account settings, or other user account conflict. See Section 10.2.2, “Issues Related to Using LDAP for a User Store”.
Important
If a multi-domain Active Directory structure is used, Universal (not Global) Groups are required. Users in Global groups have limited visibility across domains due to Active Directory privilege issues.

10.1. Supported Directory Services

JBoss ON supports major directory servers for user authentication and group authorization:
  • Red Hat Directory Server 8.1, 8.2, and 9.0
  • Microsoft Active Directory 2003 and 2008

10.2. LDAP for User Authentication

10.2.1. About LDAP Authentication and Account Creation

By default, JBoss ON stores authentication information in its internal database. JBoss ON can also use an external LDAP repository to store this user information. With LDAP authentication, the JBoss ON server sends all login requests to the LDAP directory to process.
First, the JBoss ON server searches the LDAP directory for a matching username, and then it attempts to log into (bind to) the LDAP server using the given username and password. If the bind attempt is successful, then the user is successfully authenticated to the JBoss ON server.
After the JBoss ON server is configured to use LDAP for authentication, all login attempts are authenticated against the LDAP server.
Warning
When the JBoss ON server is reconfigured to use LDAP for authentication, the LDAP information isn't validated yet. Any errors with the LDAP authentication configuration won't show up until a user attempts to log into the UI.
Note
The LDAP directory can't create JBoss ON users automatically. However, using LDAP for authentication allows new users to register themselves to JBoss ON. A new user can authenticate to JBoss ON as long as they have an LDAP account. At their first login attempt, they're redirected to a registration page which records the additional JBoss ON user information.
The JBoss ON server constructs the LDAP entry name to look for based on the JBoss ON username and information about the LDAP directory, like the parent distinguished name in the directory tree and the naming attribute used for user entries; from there, it dynamically constructs a search filter every time someone logs into JBoss ON. Custom attributes can be added to the LDAP schema, such as JONUser=true, which can make it easier and more precise to locate entries.
The LDAP directory only verifies the login credentials. The LDAP server doesn't store any other JBoss ON user data, and it doesn't create, delete, or edit entries in JBoss ON. Likewise, JBoss ON doesn't create, delete, or edit entries in the LDAP directory. The only attributes in the LDAP database that relate to JBoss ON user accounts are the username and password. Other settings in the JBoss ON user entry are stored in the JBoss ON internal database (like the user's first name and surname, email address, and role assignments).
Note
The LDAP directory is used only to check the login credentials — it doesn't store any other information about the JBoss ON users, including role assignments, and it cannot create a JBoss ON user. The JBoss ON server also cannot create LDAP users, so the LDAP user has to be created separately.
Because the LDAP directory doesn't store other attributes related to JBoss ON, it can't store a user certificate. This means that JBoss ON cannot use an LDAP directory for certificate-based authentication.

10.2.2. Issues Related to Using LDAP for a User Store

Integrating LDAP directories introduces another area where users can be created and managed and where the membership of roles can be changed. On the one hand, this can make managing users much easier, especially by allowing existing users to register themselves seamlessly and by automatically updating role membership. However, because users can still be created in JBoss ON and added manually to JBoss ON, user and role management can become messy.
The first problem is simply determining which datastore to use to authenticate users. Even after LDAP authentication is enabled, JBoss ON still checks credentials against its own user store — and it checks its own database first. This means that a user can authenticate to JBoss ON without that request being sent to the LDAP database. All of the security features of the LDAP directory — particularly password policies and account inactivation — are lost because that is not the primary authentication mechanism.
Second, using two resources for user accounts introduces the problem of erroneously mapping JBoss ON and LDAP user accounts, creating duplicate entries, or allowing ghost entries. For example, John Smith is added as a user manually to JBoss ON and also has an LDAP user account. First, he has two duplicate, separately-managed user entries. Then, John Smith goes to a different division, and his LDAP entry is automatically deleted, but his JBoss ON user account remains because JBoss ON user accounts and LDAP user accounts aren't linked. He could still log into JBoss ON. Having duplicate user accounts can introduce other problems if there are accounts with identical names. For example, Jane Smith logs into JBoss ON with her JBoss ON user account (jsmith) and password, but is improperly assigned the JBoss ON role membership of LDAP user John Smith (LDAP UID jsmith) because her JBoss ON user ID was the same as his LDAP user ID, and her account was incorrectly mapped to his LDAP account and, therefore, his LDAP group membership.
Trying to maintain user accounts in both locations also impacts roles, at least in an administrative way. LDAP groups are added as members to the role, and then the group members are listed as user members for the role. However, the list of users assigned to the role does not show where those users came from. This means that the user list can be a mix of LDAP group members and JBoss ON members who were manually added to the list. Ultimately, it becomes difficult to add or remove users because it's not clear where the role users are coming from. Limiting role membership to LDAP groups simplifies maintenance; the roles are automatically updated when users are added or deleted to the groups on the LDAP side and those changes are synchronized over to the JBoss ON role dynamically.

Figure 10.1. LDAP Groups, JBoss ON Roles, and Role Members

LDAP Groups, JBoss ON Roles, and Role Members
What all of this means is that there are three things to consider when using LDAP services for authentication or authorization:
  • Only create regular user accounts in one place. If LDAP should be used for authentication, then only add or delete user accounts in the LDAP directory.
  • Ideally, limit JBoss ON user accounts to special, administrative users and rely on the LDAP directory for regular accounts.
  • Try to design roles around LDAP groups, meaning that JBoss ON user accounts in those roles should be limited to admin accounts or avoided altogether.

10.2.3. Configuring LDAP User Authentication

Authentication is the process of verifying the identity of an entity who attempts to access a server. JBoss ON uses simple authentication, meaning it uses simple username-password pairs to verify identity.
  1. In the top menu, click the Administration tab.
  2. In the Configuration menu table on the left, select the System Settings item.
  3. Jump to the LDAP Configuration Properties area.
  4. Check the Use LDAP Authentication checkbox so that JBoss ON will use the LDAP user directory as its identity store.
  5. Configure the connection settings to connect to the specific LDAP directory.
    • Give the LDAP URL of the LDAP server. This has the format ldap://hostname[:port]. For example:
      ldap://server.example.com:389
      By default, this connects to the localhost over port 389 (standard LDAP port) or 636 (secure LDAP port, if SSL is selected).
    • To use a secure connection, check the Use SSL checkbox. When using SSL, make sure that the LDAP directory is actually running over SSL, and make sure that the connection URL points to the appropriate SSL port and protocol:
      ldaps://server.example.com:636
    • Give the bind credentials to use to connect to the server. The username is the full LDAP distinguished name of the user as whom JBoss ON binds to the directory.
      Note
      The user must exist in the LDAP directory before configuring the LDAP settings in JBoss ON. Otherwise, login attempts to the JBoss ON server will fail.
      Also, make sure that the JBoss ON user has appropriate read and search access to the user and group subtrees in the LDAP directory.
      By default, users created without roles are able to login. For more information on roles see Section 9.1.2, “Access and Roles”.
      Note
      By default, users created without roles are able to login. This has an impact since users may exist in LDAP but do not have an assigned role in JBoss ON. For more information on roles see Section 9.1.2, “Access and Roles”.
  6. Set the search parameters that JBoss ON uses when searching the LDAP directory for matching user entries.
    • The search base is the point in the directory tree where the server begins looking for entries. If this is used only for user authentication or if all JBoss ON-related entries are in the same subtree, then this can reference a specific subtree:
      ou=user,ou=JON,dc=example,dc=com
      If the users or groups are spread across the directory, then select the base DN:
      dc=example,dc=com
    • Optionally, set a search filter to use to search for a specific subset of entries. This can improve search performance and results, particularly when all JBoss ON-related entries share a common LDAP attribute, like a custom JonUser attribute. The filter can use wild cards (objectclass=*) or specific values (JonUser=true).
    • Set the LDAP naming attribute; this is the element on the farthest left of the full distinguished name. For example, in uid=jsmith,ou=people,dc=example,dc=com, the far left element is uid=jsmith, and the naming attribute is uid.
      The default naming attribute in Active Directory is cn. In Red Hat Directory Server, the default naming attribute is uid.
  7. Save the LDAP settings.
    Note
    The Group Filter and Member Property fields aren't required for user authentication. They're used for configuring LDAP groups to be assigned to roles, as in Section 10.3.2, “Associating LDAP User Groups to Roles”.

10.3. Roles and LDAP User Groups

10.3.1. About Group Authorization

Many LDAP directories already contain organizational groups with users who will need to access resources in JBoss ON. Configuring JBoss ON to connect to these directories allows JBoss ON to assign LDAP groups to roles and then pull in those member lists dynamically, so the roles are populated with pre-existing member lists. All of the LDAP users automatically inherit the permissions of that role.
In the role details page, these LDAP user groups are separated from the resource groups, so it's easy to distinguish which types of group are being added to the role.

Figure 10.2. Groups Assigned to a Role

Groups Assigned to a Role
JBoss ON determines what LDAP groups a user belongs to with a simple search. Whenever a user logs into JBoss ON and an LDAP connection is configured, JBoss ON maps that JBoss ON username to a user entry in the LDAP directory server. The specific LDAP distinguished name (DN) for the user is used as part of a search to find matching member attributes in LDAP group entries. That is, the LDAP server can check the member lists in group entries to see what groups the person with that DN belongs to.
For LDAP groups to be added to roles, three things are required:
  • An LDAP directory server connection has to be configured.
  • There has to be an LDAP attribute given to search for group entries.
    For Active Directory, this is generally the group object class. For Red Hat Directory Server, this is generally groupOfUniqueNames. Other standard object classes are available, and it is also possible to use a custom, even JBoss ON-specific, object class.
  • There has to be an LDAP attribute given to identify members in the group.
    Common member attributes are member and uniqueMember.
JBoss ON constructs an LDAP search based on the group object class and member attribute in the server configuration, plus the DN of the user given when the user logs in.
(&(group_filter)(member_attribute=user_DN))
For example, this looks for the member attribute on an Active Directory group:
ldapsearch -h server.example.com -x -D "cn=Administrator,cn=Users,dc=example,dc=com" -W -b "dc=example,dc=com" -x &apos;(&amp;(objectclass=group)(member=CN=John Smith,CN=Users,DC=example,DC=com))&apos;
Red Hat Directory Server uses the uniqueMember attribute on groupOfUniqueNames groups more commonly than member and group. For example:
/usr/lib64/mozldap6/ldapsearch -D "cn=directory manager" -w secret -p 389 -h server.example.com -b "ou=People,dc=example,dc=com" -s sub &quot;(&amp;(objectclass=groupOfUniqueNames)(uniqueMember=uxml:id=jsmith,ou=People,dc=example,dc=com))&quot;
This search returns a list of all groups to which the user is a member. If any of these LDAP groups is assigned to a JBoss ON role, then that user is also automatically a member of that JBoss ON role.
Note
Using custom LDAP group object classes can allow you to be very specific about which groups to use for JBoss ON roles.

10.3.2. Associating LDAP User Groups to Roles

Section 10.2.3, “Configuring LDAP User Authentication” describes how an LDAP directory server can be used to authenticate users to JBoss ON. Any time a user attempts to log in, that request — with the username and password — is simply forwarded to the specified LDAP directory server to see if the credentials are correct.
Members of LDAP groups can be pulled in, automatically, as members of JBoss ON roles. The LDAP group is associated with a JBoss ON role and then the group members are authorized to do whatever the JBoss ON role is configured to allow. Any changes made in the LDAP group members are automatically reflected in JBoss ON, without having to edit the JBoss ON role.
  1. In the top menu, click the Administration tab.
  2. In the Configuration menu table on the left, select the System Settings item.
  3. Jump to the LDAP Configuration Properties area.
  4. Set up the LDAP connections, as described in Section 10.2.3, “Configuring LDAP User Authentication”. It is not required that the LDAP directory be used as the identity store in order to configure LDAP authorization, but it is recommended.
  5. Set the parameters to use for the server to use to search for LDAP groups and their members.
    The search filter that JBoss ON constructs looks like this:
    (&(group_filter)(member_attribute=user_DN))
    • The Group Search Filter field sets how to search for the group entry. This is usually done by specifying the type of group to search for through its object class:
      (objectclass=groupOfUniqueNames)
    • The Group Member Filter field gives the attribute that the specified group type uses to store member distinguished names. For example:
      uniqueMember
    The user_DN is dynamically supplied by JBoss ON when a user logs into the UI.
  6. Save the LDAP settings.

10.4. Extended Example: memberOf and LDAP Configuration

The Setup

Authentication is the process of verifying someone's identity. Authorization is the process of determining what access permissions that identity has. Users are authorized to perform tasks based on the permissions granted to their role assignments.

All of the users and identities for Example Co. are stored in a backend Red Hat Directory Server LDAP database. To maintain a single, central user store, Tim the IT Guy wants to use existing LDAP users in JBoss ON and to determine user access to JBoss ON based on group membership, so, fundamentally, both authentication and authorization rules are determined by the LDAP configuration.

The Plan

There are two things to configure: how to identify users for authentication and how to organize users for authorization.

Groups are going to play a two-fold part in managing the LDAP configuration for JBoss ON for Example Co.:
  • A single group to identify JBoss ON users in the LDAP directory
  • Multiple, existing LDAP groups which are used to determine different levels of access to JBoss ON
The first thing that Tim the IT Guy determines is the way to identify users. As Section 10.2.3, “Configuring LDAP User Authentication” describes, JBoss ON identifies users to authenticate based on the results of an LDAP search, which uses a search base and optional search filter. The search filter specifies an attribute=value pair. One recommended method for identifying users is to create custom schema elements, like JONUser, which make it easy to search for matching users.
However, Tim the IT Guy has limited administrative access to the Red Hat Directory Server database. He has the ability to create groups and manage membership, but he cannot edit the schema. With no way to create an attribute that flags JBoss ON users, Tim the IT Guy has to use other configuration. Depending on the layout of the directory, he can use other kinds of configuration: views, manager, a class of service (CoS) virtual attribute, or group membership.
Using group membership is a good way to manage user assignments easily and dynamically while only having to manage a single entry (instead of individual group entries). In Directory Server, the memberOf attribute is automatically added to user entries to indicate a group that the user belongs to.
What Tim the IT Guy can do is set up a special group for all JBoss ON users, and then whatever users he likes. Because the Directory Server automatically adds and removes the memberOf attribute to user entries as members are added and removed to the group. Tim the IT Guy only has to use the memberOf attribute on those user accounts as the search filter for authentication.
dn: uid=jsmith,ou=people,dc=example,dc=com
uid: jsmith
cn: John Smith
...
memberOf: cn=JON User Group,ou=groups,dc=example,dc=com
memberOf: cn=IT Administrators,ou=groups,dc=example,dc=com
The JBoss ON LDAP authentication search filter, then, would target the memberOf attribute for that specific JBoss ON group:
memberOf='cn=JON User Group,ou=groups,dc=example,dc=com'
Using groups for access control requires an entirely different set of group definitions, which do not have to be JBoss ON-specific. These groups relate to functional areas within Example Co., and Tim the IT Guy can map existing LDAP groups to JBoss ON roles. There are three relevant LDAP groups for Example Co. for managing JBoss ON:
  • IT Administrators Group is mapped to a role with manage inventory permissions.
  • IT Manager Group is mapped to a role with view (but no write) permissions for all of the resources and with view users permissions.
  • Business Manager Group is mapped to a role with permissions to read all resource configuration, bundles, drift, measurements, operations, and alerts, but no write permissions.

The Results

Tim the IT Guy only has to create and manage one LDAP group, the JON Users Group, to set up all authentication and users for JBoss ON. He does not have to change the LDAP schema or even modify user entries directly.

For authorization, Tim the IT Guy designs JBoss ON roles around the functional groups already defined in the LDAP directory, in Example Co.'s organization, groups for IT admins, IT managers, and business managers and the level of access each requires.
As LDAP users authentication to the JBoss ON UI for the first time, they set up their own JBoss ON user details. After authenticating, they are automatically granted the appropriate level of access based on their LDAP group membership.

Part II. Managing Resource Configuration

Chapter 11. Executing Resource Operations

11.1. Operations: An Introduction

JBoss Operations Network provides a way to manage resources by scheduling and launching operations. Operations are basic management tasks. The available tasks differ for every different type of resource.
The Resource Reference: Monitoring, Operation, and Configuration Options contains a complete reference for all of the operations that can be scheduled for each resource type, as well as configurable parameters for the operations. Regardless of the type of operation or resource, the process for scheduling operations is similar for both resources and compatible groups in JBoss ON.
JBoss Operations Network allows administrators to manage resources by scheduling and launching operations. Operations manage resources by initiating or even scheduling some basic, specified tasks, such as restarting a server or running a script. Operations can be carried out on any resource in the inventory, and even on the JBoss ON agent themselves. The types of operations that are available for each resource depends on the type of resource being managed. For example, a JBoss AS server has different available operations than a cron service. The supported operations for a resource are defined by its agent plug-in, and the default operations are listed for each resource type in the Resource Reference: Monitoring, Operation, and Configuration Options.

11.1.1. A Summary of Operation Benefits

Operations provide a way to perform tasks in a consistent way, with a defined order both on resources and in task queuing, and in a way that can be tracked. Because operations are defined by plug-ins, they are extensible. The versatility of running specific tasks through JBoss ON provides several benefits to administrators:
  • They allow additional parameters (depending on how the operation is defined in the plug-in), such as command arguments and environment variables.
  • They validate any operation parameters, command-line arguments, or environment variables much as JBoss ON validates resource configuration changes.
  • They can be run on group of resources as long as they are all of the same type.
  • Operations can be ordered to run on group resources in a certain order.
  • They can be run on a recurrently schedule or one specific time.
  • Operations keep a history of both successes and failures, so that it is possible to audit the operations executed on a resource both for operations run for that specific resource and done on that resources as part of a group.

11.1.2. About Scheduling Operations

An operation schedule is the defined time when that operation can be run, including immediately.
There are two paths to schedule an operation:
  • Using the calendar setting to select a time. There are three different ways to schedule an operation using the calendar: immediately, at a set point in the future, or on a recurring schedule. The recurring schedules can be indefinite or run within a specific time period.
  • Using a cron expression. This is used almost exclusively for recurring jobs and can be used to set very complex execution schedules.

Figure 11.1. A Scheduled Operation

A Scheduled Operation
Note
The Schedules tab shows a list of scheduled operations, meaning operations which are configured but have not yet been run. [2]
When an operation is scheduled, a new operation is added to the history record for the resource, and its state is set to in progress. A message is sent to the agent telling it to invoke a specific operation on a particular resource with the arguments that were specified when the schedule was created. The agent queues operations so that only one operation is executed on the resource at a time.
When an operation completes, its raw output is sent back to the agent's resource plug-in, which ultimately parses the output and then generates an appropriate response message. This response message is then sent to the server.
If one operation ever hangs on a resource, then it blocks any other operations from being initiated because only one operation can be run on a resource at a time. Using a timeout setting for the operation enables the agent to kill the hung operation and run other queued operations.

11.1.3. About Operation Histories

The operation history is essentially an audit trail for tasks performed on the resource through JBoss ON.
Each entry shows the name of the user who scheduled the operation, the time the operation was scheduled to run, its status, and any results. For any failures, there is an informational message which contains the error message returned.
Scheduled operations and recent operations are listed in portlets in the Dashboard and in JBoss ON reports. These operation lists include both resource and compatible group operations.
As with individual resource operations, group operations also record a history for the group. These operation histories are really a list of the same operation being run on all of the resource in the group, so the operation history shows first a summary of the scheduled group operation and then the execution details for each individual resource.

11.2. Managing Operations: Procedures

11.2.1. Scheduling Operations

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Operations tab.
  4. In the Schedules tab, click the New button.
    The types of operations that are available vary, depending on the specific type of resource.
    Note
    The Schedules tab shows a list of scheduled operations, meaning operations which are configured but have not yet been run. If there are no scheduled operations, then the tab has a description that reads No items to show. That does not mean that there are no operations available for the resource; it only means that no operations have been scheduled.
  5. Fill in all of the required information about the operation, such as port numbers, file locations, or command arguments.
  6. In the Schedule area, set when to run the operation.
    When using the Calendar, the operation can run immediately, at a specified time, or on a repeatable schedule, as selected from the date widget.
    The Cron Expression is used for recurring jobs, based on a cron job. These expressions have the format min hour day-of-month month day-of-week, with the potential values of 0-59 0-23 1-31 1-12 1-7; using an asterisk means that any value can be set.
  7. Set other rules for the operations, like a timeout period and notes on the operation itself.
  8. Click the Schedule button to set up the operation.
If the operation is scheduled to run immediately, the results are available in the History subtab as the operation is in progress and then completes. If it was scheduled on a later date or with a recurring schedule, then the operation is listed in the Schedules subtab.

11.2.2. Viewing the Operation History

Note
A user may have the write to schedule and edit an operation, but that does not mean that the user has the right to delete an item from the operation history.
Deleting elements in the history requires the manage inventory permission.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Operations tab.
  4. Click the History subtab.
    Every completed or in progress operation is listed, with an icon indicating its current status.
  5. Click the name of the operation to view further details. The history details page shows the start and end times of the operation, the stdout output of the operation or other operation messages, as well as the name of the user who scheduled the operation.

11.2.3. Canceling Pending Operations

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Operations tab.
  4. In the Schedules tab, click the line of the operation to cancel.
  5. Click Delete, and confirm the action.
Note
Once the agent has started an operation it cannot be canceled. If the user attempts to cancel an operation currently in progress, the request will be ignored.

11.2.4. Ordering Group Operations

Group operations can be scheduled. This is useful when operations need to be performed in a particular order.
Note
This procedure assumes groups are already set up.
  1. In the Inventory tab in the top menu, select the Compatible Groups item in the Groups menu on the left.
  2. Click the name of the group to run the operation on.
  3. Configure the operation, as in Section 11.2.1, “Scheduling Operations”.
  4. In the Resource Operation Order area, set the operation to execute on all resources at the same time (in parallel) or in a specified order. If the operation must occur in resources in a certain order, then all of the group members are listed in the Member Resources box, and they can be rearranged by dragging and dropping them into the desired order.
    Optionally, select the Halt on failure checkbox to stop the group queue for the operation if it fails on any resource.

11.2.5. Running Scripts as Operations for JBoss Servers

JBoss ON auto-discovers resource scripts when the resource is discovered. Scripts can be managed just like any other resource to perform operations. There are three types of scripts that JBoss ON discovers, depending on the operating system:
  • .bat for Windows
  • .sh for Unix and Linux
  • .pl scripts for Unix and Linux
Note
Scripts on Linux and Unix systems need to have the x-bit set to be detected.
Connection properties and environment variables can be added to a script.
To execute a script as an operation:
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Operations tab.
  4. In the Schedules tab, click the New button.
  5. Select Execute CLI script as the operation type from the Operation drop-down menu.
    Note
    The Execute script option is only available for JBoss AS and JBoss AS 5 resources, by default, and only if a script is available to execute.
  6. Enter any command-line arguments in the Parameters text box.
    Each new argument has the format name=value; and is added on a new line. If the variable's value contains properties with the syntax %propertyName%, then JBoss ON interprets the value as the current values of the corresponding properties from the script's parent resource's connection properties.
  7. Finish configuring the operation, as in Section 11.2.1, “Scheduling Operations”.

11.2.6. Setting an Operation Timeout Default

Only one operation can run on a resource at one time. An optional timeout setting prevents an operation from hanging indefinitely and blocking other operations from running. A global default timeout can be set in the JBoss ON server configuration to prevent operations from being blocked on a resource, even if a timeout period isn't set on a specific operation.
Note
This server setting is a fallback value. Operation plug-ins can define their own timeouts in the plug-in descriptor or individual operations can specify a timeout. Both of those settings override the server default.
  1. Open the rhq-server.properties file.
    vim serverRoot/jon-server-3.3.0.GA/bin/rhq-server.properties
  2. Change or add the value of the rhq.server.operation-timeout parameter to the amount of time, in seconds, for the server to wait before an operation times out.
    rhq.server.operation-timeout=60

11.2.7. Operation History Report

Every resource tracks its own individual operations history, as in Section 11.2.2, “Viewing the Operation History”.
JBoss ON also keeps a master list of all operations, for all resources. This is displayed in the Operation History Report, in the Reports tab.
As with the resource-level operation history, the Operation History shows the operation name, the date it was submitted, and its status. Because all resources are listed, the Operation History Report also shows the resource name and its parent (and grandparent) to help disambiguate on which resource the operation ran.

Figure 11.2. Operation History Report

Operation History Report
The Operation History Report can be filtered (not just sorted) by two criteria: the operation status and the date or date range that the operation was submitted.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
Only the information displayed for the report is exported. If the Operation History Report is filtered by date or status, only the matching operations are included in the report.
To export a report, simply click the Export button. The report will automatically be downloaded as configurationHistory.csv.


[2] If there are no scheduled operations, then the tab has a description that reads No items to show. That does not mean that there are no operations available for the resource; it only means that no operations have been scheduled.

Chapter 12. Summary: Using JBoss ON to Make Changes in Resource Configuration

One of the most basic parts of managing your applications, servers, and services is the simple ability to change their configuration.
JBoss Operations Network allows you to view the current configuration for many resource types directly in the JBoss ON UI, without having to access the platform's filesystem directly. Even more, JBoss ON allows you to edit the configuration directly for a single resource or for an entire group of compatible resources.
JBoss ON has three key ways that administrators can manage resource configuration:
  • Directly edit resource configuration. JBoss ON can edit the configuration files of a variety of different managed resources through the JBoss ON UI.
  • Audit and revert resource configuration changes. For the specific configuration files that JBoss ON manages for supported resources, you can view individual changes to the configuration properties and revert them to any previous version.
  • Define and monitor configuration drift. System configuration is a much more holistic entity than specific configuration properties in specific configuration files. Multiple files for an application or even an entire platform work together to create an optimum configuration. Drift is the (natural and inevitable) deviation from that optimal configuration. Drift management allows you to define what the baseline, desired configuration is and then tracks all changes from that baseline.
This section has a very general overview of these three ways of managing resource configuration. More detailed descriptions and procedures are in the subsequent sections.

12.1. Easy, Structured Configuration

Basic configuration files use simple key-value pairs to define information.
key1 = value1
key2 value2
These are simple properties, representing strings, numbers, or booleans — any type of information where there is one value per key.
JBoss ON also supports resource configuration using complex properties, which may be a list of values or a map of values (a table of lists).
<default-configuration>
    <ci:list-property name="my-list">
        <c:simple-property name="element" type="string"/>
        <ci:values>
           <ci:simple-value value="a"/>
           <ci:simple-value value="b"/>
           <ci:simple-value value="c"/>
        </ci:values>
    </ci:list-property>
</default-configuration>
JBoss ON parses the configuration files — both simple properties and complex properties — and then renders a structured, easy-to-follow form in the JBoss ON GUI. Simple properties are displayed with fields or radio buttons as appropriate, while complex properties are displayed with menus or other selection options.

Figure 12.1. Configuration Form for a Samba Server

Configuration Form for a Samba Server
The structured configuration form makes it easy for you to view the current configuration quickly.
The structured form also makes it possible for JBoss ON to validate that the configuration properties have valid formats before saving changes.
Note
JBoss ON only validates that the given value matches the required format for that property. It does not validate that the value given is reasonable or allowed for that resource property.
Performing configuration changes in JBoss ON has major benefits for IT administrators:
  • There is instant validation on the format of properties that are set through the UI.
  • Audit trails for all configuration changes can be viewed in the resource history for both external and JBoss ON-initiated configuration changes.
  • Configuration changes can be reverted to a previous stable state if an error occurs.
  • Configuration changes can be made to groups of resource of the same type, so multiple resources (even on different machines) can be changed simultaneously.
  • Alerts can be used in conjunction with configuration changes, either to send automatic announcements of any configuration changes or to initiate operations or scripts on related resources as configuration changes are made.
  • Access control rules are in effect for configuration changes, so JBoss ON users can be prevented from viewing or initiating changes on certain resources.

12.2. Identifying What Configuration Properties Can Be Changed

JBoss ON supports configuration change for an extensive array of resources — including hosts and sudoers files, Samba servers, Postfix servers, databases, web app contexts, cron tabs, web servers, and scripts.
Any resource which supports configuration changes through JBoss ON has a Configuration tab on its resource page.

Figure 12.2. Configuration Tab

Configuration Tab
For a complete reference of the configuration properties for each resource, see the Resource Reference: Monitoring, Operation, and Configuration Options.

12.3. Auditing and Reverting Resource Configuration Changes

Tracking for configuration changes is a crucial part of systems administration. It's important for maintenance, for performance, and for incident recovery — particularly when it is possible to revert change or correlate changes to incidents.
Every time a change is made to the resource configuration, whether through JBoss ON GUI or on the resource itself, the change is detected by JBoss ON and logged with a revision number. When a change is made outside of JBoss ON, the change is simply noted. When the change is made through the JBoss ON UI, the timestamp and the name of the user who made the change are both recorded.
Every change is recorded in a history, and the different changesets can be viewed and compared to one another. One change can be selected and the resource configuration can be rolled back to that selected change.
Tracking the configuration history and reverting changes is covered more in Section 14.1, “Tracking and Comparing Configuration Changes” and Section 14.2, “Reverting Configuration Changes”.

12.4. Tracking Configuration Drift

Much of the JBoss ON configuration management is designed around implementing changes for resources by editing configuration files or updating files and packages. But another aspect of managing configuration is detecting changes.
IT administrators must invest a significant amount of time planning the optimum configuration for systems in every type of environment, from production systems to internal resources. This ideal configuration includes file settings, software versions, and system settings. Resource configuration is going to change naturally over time, but administrators need to be able to track those changes to make sure that no unplanned or undesirable changes impact the resource. Defining a baseline configuration and tracking changes helps systems remain resilient during both maintenance and failures.
The unplanned changes that occur to a resource's configuration is called drift, as the configuration moves away from the designed baseline. Drift is common because of frequent software and hardware updates, particularly in a colocation facility or using virtual machines.
Production, staging, development, and recovery configurations are designed to have identical or near-identical configuration to maintain consistency. As the configurations within the different environments change, there emerges a configuration gap. Ultimately, this configuration gap can lead to disaster recovery failures or high availability failures because the configuration of the production system and the backup system are too different.
Drift monitoring provides a very general, freewheeling content monitoring. Rather than structured configuration management, drift monitoring tracks changes, any changes, in files — even binary files.
Configuration History v. Configuration Drift
The configuration history for a resource applies only to the supported configuration properties for that specific resource instance.
Drift management has a much more external view of configuration changes. Drift is associated with a resource — like a platform or a JBoss server — but it is not restricted to that resource or to set properties for that resource:
  • Drift looks at whole files within a directory, including added and deleted files and binary files.
  • Drift supports user-defined templates which can be applied to any resource which supports drift monitoring.
  • Drift can keep a running history of changes where each changeset (snapshot) is compared against the previous set of changes. Alternatively, JBoss ON can compare each change against a defined baseline snapshot.
The drift definition that is essentially a profile that identifies a directory and files that should be monitored. Any time there is any change in that drift base directory or any of its subdirectories — file modifications, new files, or deleted files — the drift detection scan notices the change and records it.
Drift detection can be used by administrators to track scheduled changes, maintenance and updates, and server changes. There are a lot of common scenarios where administrators need to be aware that change has occurred (and even be able to identify the specific changes made), but that occur in areas outside normal JBoss ON configuration tracking:
  • System password changes
  • System ACL changes
  • Database and server URL changes
  • JBoss settings changes
  • Changed JAR, WAR, and other binary files used by applications
  • Script changes
Note
Drift is not bound or restricted to a resource managed by JBoss ON. You can create a drift definition for a platform and set it to monitor any file or directory on that platform, even if it is outside the JBoss ON inventory, as long as that directory is accessible to the system user that the JBoss ON agent runs as.
Managing configuration drift is described more in Chapter 15, Managing Configuration Drift.

Chapter 13. Changing the Configuration for a Resource

13.1. Changing the Configuration on a Single Resource

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Open the Configuration tab for the resource.
  4. Click the Current subtab.
  5. To edit a field, make sure the Unset checkbox is not selected. The Unset checkbox means that JBoss ON won't submit any values for that resource and any values are taken from the resource itself.
    Then, make any changes to the configuration.
    The list of available configuration properties, and their descriptions, are listed for each resource type in the Resource Reference: Monitoring, Operation, and Configuration Options.
  6. Click the Save button at the top of the properties list.

13.2. Changing the Configuration for a Compatible Group

Similar to other templating functions in JBoss ON, like alert templates, configuration changes can be made on a compatible or autogroup, so that all of the members of that group can be up updated simultaneously with the same settings.
Note
To change the current configuration for a group, a few conditions must be true so that the current group configuration can be reliably calculated for the individual resource configurations:
  • The group members must all be the same resource type.
  • All group member resources must be available (UP).
  • No other configuration update requests can be in progress for the group or any of its member resources.
  • The current member configurations must be successfully retrieved from the agents.
The process for setting the configuration for a group is the same as setting it for an individual resource:
  1. Click the Inventory tab in the top menu.
  2. In the Groups box in the left menu, select the Compatible Group link.
  3. Select the group to edit.
  4. Open the Configuration tab.
  5. Click the Current subtab.
  6. To edit a field, make sure the Unset checkbox is not selected. The Unset checkbox means that JBoss ON won't submit any values for that resource and any values are taken from the resource itself.
    Then, make any changes to the configuration.
    The list of available configuration properties, and their descriptions, are listed for each resource type in the Resource Reference: Monitoring, Operation, and Configuration Options.
    Note
    It is possible to change the configuration for all members by editing the form directly, but it is also possible to change the configuration for a subset of the group members. Click the green pencil icon, and then change the configuration settings for the members individually.
  7. Click the Save button at the top of the form.

13.3. Editing Script Environment Variables

Scripts are autodetected on a server, as are other applications and services on the machine. Scripts can be configured and managed like any other resource, which means that JBoss ON allows you to both define configuration settings for and set up operations to run the scripts in inventory.
Whether a script is added or detected, there are only two configuration areas for the inventory entry: the path to the script, which places the script within the hierarchy, and any environment variables that should be set with the script.
These environment variables can be added and edited even after the script is imported:
Important
Before setting environment variables in the JBoss ON configuration, make sure that the environment on the resource is already configured properly.
  1. Click the Inventory tab in the top menu.
  2. Search for the script resource.
  3. Open the Configuration tab for the script resource.
  4. Click the plus sign (+) to add an environment variable.
  5. Enter the environment variable. Each new environment variable has the format name=value; and is added on a new line.
    If the variable's value contains properties with the syntax %propertyName%, then JBoss ON interprets the value as the current values of the corresponding properties from the script's parent resource's connection properties.
  6. After resetting an environment variable, restart the JBoss ON agent to propagate the changes. If the agent isn't restarted, new variables will not be propagated to the resource and will not resolve when the script is next executed, even if the configuration is correct.
Note
Add the line @echo off in Windows scripts to prevent echoing the executed commands along with the execution results.

13.4. Configuring Apache for Configuration Management (Deprecated)

JBoss ON manages configuration on Apache resources using an Augeas lens. A special version of Augeas is included with the JBoss ON agent which enables Apache configuration management. That Augeas lens must be enabled on the Apache resource for configuration management to work.

13.4.1. Considerations and Notes for Apache Configuration Management

Deprecated Augeas Plug-in

Apache configuration management is supported through a special Augeas agent plug-in, which connects with and manages the Augeas lens on the Linux instance. The Augeas agent plug-in is deprecated in JBoss ON 3.1.1 and may be removed in a later release.

Augeas and Apache Monitoring

The Augeas lens is not required for Apache monitoring. It is only used for Apache configuration management. An Apache resource can be monitored, with alerting, operations, and all other management tasks available without any additional configuration. The Augeas lens is used only for editing the Apache configuration files and virtual hosts through JBoss ON.

Supported Platforms for Apache Configuration

Apache configuration management is only supported for Apache instances installed on Linux.

Disabling noexec Options for Apache Directories

If the /tmp directory is configured a noexec in the fstab file, the agent throws exceptions because it cannot properly initialize the Augeas lens. In that case, the Configuration tab is unavailable for the Apache resource.

Make sure that the /tmp directory does not have noexec set as an option.
#
# /etc/fstab
#
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0

13.4.2. Enabling Configuration Management

Apache configuration management is configured as one of the connection settings for the Apache resource, which sets how the agent connects to the resource.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then search for the Apache resource.
  3. Click the IP address of the Apache instance.
  4. Open the Inventory tab, then click the Connections subtab.
  5. Jump to the Augeas Configuration section.
  6. Select the Yes radio button to enable the Augeas lens.

Chapter 14. Tracking Resource Configuration Changes

The revision numbers are global number across the JBoss ON server. For example, if Resource A is edited, then it gets revision #1. Then, when Resource B is edited, it gets revision #2, and the next edit gets #3.
Note
A user may have the right to edit or revert configuration, but that does not mean that the user has the right to delete an item from the configuration history.
Deleting elements in the history requires the manage inventory permission.

14.1. Tracking and Comparing Configuration Changes

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Open the Configuration tab for the resource.
  4. Click the History subtab.
  5. Select the line of the configuration version to view or compare. Use the Ctrl key to select multiple versions. The current (most recent successful) configuration state is marked by a green check mark.
  6. Click the Compare button.
  7. The pop-up window shows all of the changes in a directory-style layout, with each of the configuration areas as a high-level directory. Any changes are marked in red, and the values are shown for each selected version.

14.2. Reverting Configuration Changes

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Open the Configuration tab for the resource.
  4. Click the History subtab.
  5. Select the line of the configuration version to roll back to. The current (most recent successful) configuration state is marked by a green check mark.
  6. Click the Rollback button.

14.3. Viewing the Configuration History Report

Every resource (that supports configuration) tracks its own individual configuration change history, as in Section 14.1, “Tracking and Comparing Configuration Changes”.
JBoss ON also keeps a master list of all configuration changes, for all resources. This is displayed in the Configuration History Report, in the Reports tab.
As with the resource-level configuration history, the Configuration History shows the version number of the change, the time the configuration change was requested and completed, its status, and the requesting user. Because all resources are listed, the Configuration History Report also shows the resource name and its parent (and grandparent) to help disambiguate on which resource the change occurred.

Figure 14.1. Configuration History Report

Configuration History Report
The Configuration History Report supports compare operations, as with the resource-level configuration history. This is useful because you can compare not only versions of configuration for the same resource, but also for the same configuration property in different resources (of the same type). This helps administrators figure out where their infrastructures are and to pinpoint changes or diversions between the configuration for similar resources.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
To export a report, simply click the Export button. The report will automatically be downloaded as configurationHistory.csv.

Chapter 15. Managing Configuration Drift

Much of the JBoss ON configuration management is designed around implementing changes for resources by editing configuration files or updating files and packages. But another aspect of managing configuration is detecting changes.
IT administrators must invest a significant amount of time planning the optimum configuration for systems in every type of environment, from production systems to internal resources. This ideal configuration includes file settings, software versions, and system settings. Resource configuration is going to change naturally over time, but administrators need to be able to track those changes to make sure that no unplanned or undesirable changes impact the resource. Defining a baseline configuration and tracking changes helps systems remain resilient during both maintenance and failures.
The unplanned changes that occur to a resource's configuration is called drift, as the configuration moves away from the designed baseline. Drift is common because of frequent software and hardware updates, particularly in a colocation facility or using virtual machines.
Production, staging, development, and recovery configurations are designed to have identical or near-identical configuration to maintain consistency. As the configurations within the different environments change, there emerges a configuration gap. Ultimately, this configuration gap can lead to disaster recovery failures or high availability failures because the configuration of the production system and the backup system are too different.
Drift monitoring provides a very general, freewheeling content-based monitoring. Rather than structured configuration management, drift monitoring tracks changes to content on the local filesystem. This means any changes in any files — even binary files [3].

15.1. Understanding Drift

Of course, drift monitoring isn't as simple as checking for changes. One of the core questions is what changes matter? There are two conceptual parts to that question:
  • What directories (and files within those directories) matter for drift monitoring? Even though a drift definition is defined for a resource, the actual drift detection is performed at the directory level. Drift monitoring, then, can hit anywhere on a platform — even outside resources managed by JBoss ON.
  • How do you identify a change? Do you compare it to the version immediately before it or to an established baseline?
Once you identify what changes you want to monitor for drift, then you can use JBoss ON to set up monitoring and alerting effectively.

15.1.1. Drift Definitions and Detection

The first part of drift detection is identifying what you are monitoring.
JBoss ON defines a drift definition that sets the target location for drift monitoring. The target can be identified from some configuration element for the resource — it can be a directory or file on the filesystem, a resource configuration property, a resource plug-in parameter, or a monitoring trait. This target is the base directory. There are several different configuration areas within each resource that (potentially) define a filesystem location. The base directory is identified by determing what configuration type contains that information; that information area is the value context. A value context can be any one of four different areas:
  • From the plug-in configuration (pluginConfiguration). This means, it can be taken from any of the connection properties for the resource. Connection properties can include log files, deployment directories, and installation directories, depending on the resource type.
  • From the resource configuration (resourceConfiguration). This means, it can be taken from any of the configurable properties for the resource.
  • From a trait (measurementTrait). Traits are informational measurement properties for the resource.
  • An explicit filesystem location. If none of the resource properties have the proper location or if a different location should be used for drift, then the directory can be specified in the fileSystem property.
The actual value is the value name.
Note
Plug-in configuration (connection properties), resource configuration properties, and traits for each resource are listed in the Resource Reference.
For example, for a base directory of /etc/ that only includes changes to *.conf files, the elements in the drift definition are:
Value context: fileSystem
Value name: /etc
Includes: **/*.conf
Note
Drift detection is performed at the filesystem level. This means that drift detection is not bound or restricted to a resource managed by JBoss ON. You can create a drift definition for a platform and set it to monitor any file or directory on that platform, even if it is outside the JBoss ON inventory, as long as that directory is accessible to the system user that the JBoss ON agent runs as.
By default, every subdirectory and file underneath the base directory is monitored for drift. The includes/excludes options define subdirectories or files that are explicitly included or explicitly excluded from drift monitoring. If includes is used, then only the specified directories or files are monitored and everything else is implicitly excluded, and vice versa for excludes. Included/excluded directories and files are identified by a path and a pattern. The path is the starting point beneath the base directory, and the pattern matches the file to monitor.

Table 15.1. Combinations to Include Specific Files

Files to Monitor for Drift 'Includes' Path 'Includes' Pattern
/etc and all its subdirectories Blank Blank
For *.conf files in /etc and all subdirectories . **/*.conf [a]
For *.conf files only in the /etc directory, with no subdirectories (/etc/*.conf) . *.conf
For *.conf files only in a subdirectory one level below /etc (/etc/*/*.conf) Not possible Not possible
For any file in a specific subdirectory (yum.repos.d/) below /etc yum.repos.d (subdirectory name) Blank
[a] This must have a double asterisk for the directory part. It will not work with a single asterisk.
The drift definition also sets an interval, or frequency, for how often the agent checks for drift. This is a very important setting for performance, both for JBoss ON and for data management. Setting a frequency that is too high risks missing changes or lumping changes together into large (and therefore difficult to manage) snapshots. However, setting the interval too low can impact JBoss ON agent and server performance.
The key thing about the drift definition is that it sets what to look at and how often to look.
Note
All drift detection runs are performed outside the agent plug-in and independent of the resource state. A drift detection scan can be run even if the resource is not running.

15.1.2. Snapshots, Deltas, and Baseline Images

The second part of drift detection is identifying how you want to define a change. Change is comparative. It takes the current version of a file and compares it to some previous version. The question for drift management is what previous version to compare to.
When a drift definition is first created, the agent collects all of the files in that base directory and subdirectories and sends information about them to the server. This collection is the initial file set.
From that point onward, the agent only sends change information about the files. Each set of changes is a snapshot. For text files, the change information includes the content of the file and diffs (both constructed on the JBoss ON server based on patches sent by the agent). For binary files, drift only records that the file change and displays a SHA and timestamp. A snapshot is always based on real files from a real resource.
Note
The agent does not send the actual files to the JBoss ON server. The agent sends information about file changes back to the server. These updates only contain the deltas between versions; they're not full files. This minimizes the network I/O.
The actual diffs are generated on the server from the content that the server stores.
The way that a snapshot is created is by comparing the current files against the agent's version. There are two ways that this comparison can be made:
  • It can compare against the next-most recent version of the files.
  • It can compare against a defined, stable baseline.
The first option is a rolling snapshot. This is the simplest setting, because it just keeps a running tally of changes.

Figure 15.1. Rolling Snapshots

Rolling Snapshots
The second option is a pinned snapshot, and this is the method that gives administrators the most insight and control over drift. A pinned snapshot means that some image of the base directory has the optimal, approved configuration and this snapshot is selected as the baseline. It is in essence pinned in place, and every subsequent change is compared against that pinned snapshot, rather than being compared against each other.

Figure 15.2. Pinned Snapshots

Pinned Snapshots
Snapshots exist at the resource-level, because they are based on the real files that exist on a system. When a snapshot is pinned to a resource-level definition, then any changes made on that system are compared to that snapshot. When the current file version matches the pinned snapshot, the resource is compliant.
A snapshot for a resource can also be pinned to a drift template — then, it is applied to every definition attached to that template. This is really powerful for administrators. For example, you can use a staging or development server to create the best system configuration for EAP performance, and then apply that EAP baseline snapshot to every EAP server in the production environment by pinning it to a drift template. You can see immediately all the EAP configuration relative to your defined ideal.

15.1.3. Destination Directories with Special File Types

Drift looks at both files and directories on the local system to generate snapshots and identify changes. The majority of these files and directories are going to be real files, but Unix does have some special file types, and the drift operation may encounter those files as part of processing the destination directory. There are some behavior implications, particularly with symbolic links and named pipes.
With symbolic links, drift detection follows any links back to the original file or directory, and includes those files in the snapshot. For example, if a symlink is set up for some library:
ln -ls /home/dev/libs /usr/share/jbossas/server/libs
If drift is configured on the libs/ directory in the JBoss Enterprise Application Platform home directory, it will follow the symlink back to /home/dev/libs, and include all of those files in the drift snapshot.
Important
Be careful when configuring drift against a directory which contains symlinks. All of the linked files will be included as part of the drift target.
If the linked directory has a large number of files, then drift detection runs may take longer than expected. Additionally, changes in that symlinked directory may have an unexpected impact on drift detection by recording many changes as drift when they weren't intended to be.
If you do not want to include the symlinked directory in the drift definition, use the excludes parameter in the drift definition to exclude the symlink.
The other special file type that is common on Unix systems is a pipe. As with a symlink, drift detection runs can detect a fifo file within the target directory. However, unlike symlinks, drift cannot process the fifo file, which causes drift detection to hang.
Note
Use the excludes parameter in the drift definition to exclude any named pipes in the target directory.

Table 15.2. Drift Definitions and Unix File Types

File Type Supported by Drift?
File Yes
Directory Yes
Symbolic link Yes
Pipe No
Socket No
Device No

15.1.4. Drift and Resource Types

Whether drift is supported is defined in the resource type (which is discussed in Section 15.3.1, “About Resources and Drift Definition Templates”). If a drift template is defined in the resource type's rhq-plugin.xml descriptor, then that resource type supports drift. The template is a starting point (not an enforced configuration, like alert or metric collection templates).
Three JBoss ON standard resource types support drift:
  • All platforms
  • JBoss EAP 6 (AS 7), and all resources which use the JBoss AS 5 plug-in
  • JBoss AS/EAP 5, and all resources which use the JBoss AS 5 plug-in
  • JBoss AS/EAP 4 (deprecated)
Because drift support is defined in the plug-in descriptor, custom plug-ins can be created that add drift support for those resource types. For examples of writing agent plug-ins with drift support, see "Writing Custom JBoss ON Plug-ins."
Note
Drift is not supported on embedded web applications, such as an embedded WAR under an EAR application.
Drift detection is performed at the directory level. It is not tied to a specific resource. This means that drift detection can be run even when a resource is not running. It also means that drift detection can occur for an application, service, or file that is not managed by JBoss ON or for a resource type that does not otherwise support drift.
To monitor an entity outside the three supported resources, just configure drift detection on the platform resource and define the base directory as whatever directory path is used by the application or service you want to monitor.

15.1.5. Space Considerations for Drift Monitoring

Configuring drift monitoring can have a significant impact on disk space requirements.
JBoss ON stores multiple snapshots. This is part of versioning control, allowing changes to monitored directories to be reverted and managed.
Therefore, the system which hosts the backend database (Oracle or PostgreSQL) must have enough disk space to store all versions of all bundles. Additionally, the database itself must have adequate tablespace for the content.
Size considerations can also affect how drift monitoring is configured, in several ways:
  • The size of the directory being monitored. In some cases, it may be better to monitor multiple smaller subdirectories rather than one large, high-level directory.
  • The frequency of drift detection runs, balancing the need to capture changes versus the number of backup copies.
  • How long drift snapshots are stored. By default, unused snapshots (meaning, unpinned snapshots) are stored for 31 days and then deleted. Changing how long snapshots are stored can help manage the database size.
When calculating the required amount of space, estimate the size the targeted directories, and then the frequency that snapshots are taken to get an idea of how many snapshots will be stored. At a minimum, have twice that amount of space available; both PostgreSQL and Oracle require twice the database size to perform cleanup operations like vacuum, compression, and backup and recovery.

15.1.6. Back to Drift Monitoring

The goal of setting up drift detection is to provide clarity into how systems and application servers are being modified. JBoss ON provides two ways to manage drift.

Drift Monitoring

Drift monitoring is the ability to track changes to target locations. The JBoss ON GUI allows you to view snapshots all together, compare changes for individual files between snapshots, view the current configuration, and view change details. It also provides inventory and drift reports and indicates, at a glance, whether a resource is compliant with an associated pinned snapshot.

Drift Alerting

A specific alert condition exists that will trigger an alert whenever there is drift. For rolling snapshots, this will send an alert once (and only once) for each drift snapshot. For pinned snapshots, the drift alert is fired for every detection run for as long as the resource is out of compliance, even if there are no subsequent changes.

Note
There is no direct way to remedy drift through the JBoss ON GUI. However, it is possible to launch a JBoss ON CLI script in response to a drift alert. For example, you can create a patch of your ideal EAP configuration. If an EAP server drifts from that configuration, then you can use a JBoss ON CLI script to deploy that EAP patch bundle to the drifting EAP server.

15.2. Adding a Drift Definition for a Resource

Important
The directories where drift detection is being run cannot be changed after the definition is created. Be careful to get the base directory and the included and excluded files properly configured before saving.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Open the Drift tab for the resource.
  4. Click the New at the bottom to add a new definition.
  5. Select the template to use to as the basis for the new definition.
    Plug-in defined templates are defined in the platform and JBoss server resources, as well as any other resource which supports drift monitoring. Additional, user-defined templates can be also be created and applied.
  6. Give a unique name to the definition. The name and the base directory are combined to identify the definition within JBoss ON.
  7. Define the settings for the definition, like the interval and whether it is associated with the template. The properties are listed in Table 15.3, “Drift Definition Properties”.
  8. Set the base directory. This is the top-most directory where drift detection is run for the definition, and the scan recourses down.
    The template itself defines an initial directory, but it may be useful to set a more specific directory to use.
  9. Click the button with the green plus (+) sign to add a subdirectory to include or exclude. The directory can be the base directory by specifying a period (.) as the directory. The pattern identifies which files within the directory to recognize by the service, either to explicitly include or explicitly exclude.
    The filters support Ant-like FilePatterns, using a path and pattern. The patterns support asterisks (*) as wildcards for any number of characters and question marks (?) for single character wild cards. For example, **/*.conf can be used to include only .conf files in any subdirectory.
    There can be multiple include/exclude filters. Each directory and pattern can be added separately.

Table 15.3. Drift Definition Properties

Property Description
Name A name for the drift detection definition. The name and the base directory, together, uniquely identify the definition.
Base Directory: Value Context
The type of configuration property which is used to identify the base directory. This identifies what type of element in the resource supplies the value. There are four options:
  • File system, which is simply an absolute directory path on the resource. This directory must exist for drift to work.
  • Resource configuration, which is a configuration property defined for the resource.
  • Trait, which is one of the monitored traits for that resource.
  • Plug-in configuration property, which is a property defined in the resource plug-in.
Base Directory: Value Name The actual value for the drift detection definition to use for the base directory context. For example, if this is a file system context, then the value name is the directory path.
Includes
Explicitly includes directories, files, or files and directories matching a pattern, relative to the base directory, in the drift detection.
The filters support Ant-like FilePatterns, using a path and pattern. The patterns support asterisks (*) as wildcards for any number of characters and question marks (?) for single character wild cards.
If a pattern is used, then a path must be specified, even if the path is the base directory. For example, to include only .conf files in the base directory, the pattern is *.conf and the path is a period (.) to indicate the local directory.
Excludes
Explicitly excludes directories, files, or files and directories matching a pattern, relative to the base directory, from the drift detection.
The filters support Ant-like FilePatterns, using a path and pattern. The patterns support asterisks (*) as wildcards for any number of characters and question marks (?) for single character wild cards.
If a pattern is used, then a path must be specified, even if the path is the base directory. For example, to include only .conf files in the base directory, the pattern is *.conf and the path is a period (.) to indicate the local directory.
Enabled Enables or disables the definition. Disabling a definition means that no detection scans are run.
Interval Sets the frequency, in seconds, where the definition is eligible for a detection run. This is not a hard setting. Because load or other scheduled operations for the agent, the detection run is not guaranteed to run at the specified interval.
Pinned
Sets whether drift is determined in a rolling way or if it is associated (pinned) with a baseline snapshot. If this is set when the definition is created, then the initial snapshot is used as the baseline.
Definitions attached to a pinned template cannot be unpinned. Definitions which are attached to an unpinned template or which are not attached to a template can be pinned or unpinned freely.
Drift Handling Mode Sets whether drift changes are treated as events which trigger an alert (the default) or as expected, so that no alerts are triggered.
Attached to Template
Sets whether the resource-level definition is subordinate to a template. If it is attached to a template, then any changes to the template are reflected in the resource definition, including if the template is deleted.
By default, definitions are attached to the template from which they are created.
Description A simple text description of the definition.

15.3. Creating a Drift Definition Template

Every time a new drift definition is created, it is based on an existing template for the resource type. At least one template is defined for resource types (by default, platforms and JBoss application servers) in their resource plug-in. Additional templates can be created by users.

15.3.1. About Resources and Drift Definition Templates

Resources of the same type frequently need to have the same, or similar, configuration settings. Particularly for an area like configuration drift, consistency is crucial for accurate and timely IT maintenance. JBoss ON allows this consistency using drift definition templates. Much like alert and monitoring templates, drift definition templates are defined for a resource type (regardless of whether any resources of that type actually exist) and can then be applied to specific resources in the inventory.
Drift templates are a little different than other template types in JBoss ON. First, a drift definition template is exactly that — it is an outline of default settings and values to use when creating a resource-level drift definition. It is not automatically applied to resources.
Additionally, there are two types of drift templates: plug-in defined templates and user-defined templates.
At least one drift definition template is actually defined as part of the plug-in for a resource type. Defining a template in the plug-in descriptor is what indicates that a resource type supports drift monitoring. These are plug-in defined templates; these are the default templates.
Having a plug-in defined template is the way that the JBoss ON agent recognizes that a particular resource type supports drift monitoring. So, the plug-in defined template has a dual purpose. It lets JBoss ON know what resource types support drift, and it gives basic input to help administrators start making their own drift definitions.

Example 15.1. A JBoss Server Drift Definition Template

<drift-definition name="Template-Base Files"
                  description="Monitor base application server files for drift. It defines monitoring for some standard sub-directories of the HOME directory.  Note, it is not recommeded to monitor all files for an application server. There are many files, and many temp files.">
    <basedir>
         <value-context>pluginConfiguration</value-context>
         <value-name>homeDir</value-name>
    </basedir>
    <includes>
 <include path="bin" />
        <include path="lib" />
        <include path="client" />
    </includes>
</drift-definition>
New drift definition templates can be added by administrators, in addition to the plug-in defined template; these are user-defined templates. These templates can reflect the unique infrastructure and application environment.
A resource-level drift definition is always based on a drift template, which provides some default values to the definition during creation. That template can be plug-in defined or user-defined. Resource-level drift definitions do not have to be attached to a template, so they do not have to be changed every time the template changes, but they are always based on an existing template.
There are some things to remember about drift definition templates:
  • Drift templates are not automatically applied to a resource, unlike other template types in JBoss ON. Drift templates are used as the basis for creating resource-level definitions.
  • Default drift templates are defined for resources as part of their plug-in descriptor. Custom, user-defined templates can be added along with those defaults.
  • Every drift definition is based on a template initially, even if that definition is not attached to that template post-creation.
  • Snapshots (the file sets associated with drift definitions) always originate on a resource with a drift definition first. For any content to be associate with a template, the resource-level snapshot has to be promoted up to the template. Drift templates do not generate snapshots or files and then push that down to the resource.

15.3.2. Creating a Drift Definition Template

A drift template creation form is almost identical to a resource-level drift definition, with two exceptions: it cannot be pinned to a snapshot at the time it is created and it cannot be associated with another template. Obviously, a template is not dependent on another template (even though it is created from another template.) Being unable to pin a template to a snapshot is also logical; when a template is created, it is not associated with any resources. So, it is not possible to generate snapshots, which means that there is nothing to pin the template to.
  1. Click the Administration tab in the top menu.
  2. Select the Drift Definition Templates menu table on the left.
  3. Click the pencil icon for the resource type to add the template to. Not all resources support drift, so they cannot be selected.
  4. Click the New at the bottom to add a new template.
  5. Select the template to use to as the basis for the new template.
    Plug-in defined templates are defined in the platform and JBoss server resources, as well as any other resource which supports drift monitoring. Additional, user-defined templates can be also be created and applied.
  6. Give a unique name to the template. The name and the base directory are combined to identify the definition within JBoss ON.
  7. Define the settings for the definition, like the interval and whether it is enabled by default. The properties are listed in the table in Section 15.2, “Adding a Drift Definition for a Resource”.
  8. Set the base directory. This is the top-most directory where drift detection is run for the definition, and the scan recourses down.
  9. Click the button with the green plus (+) sign to add a subdirectory to include or exclude. The directory can be the base directory by specifying a period (.) as the directory. The pattern identifies which files within the directory to recognize by the service, either to explicitly include or explicitly exclude.
    The filters support Ant-like FilePatterns, using a path and pattern. The patterns support asterisks (*) as wildcards for any number of characters and question marks (?) for single character wild cards. For example, **/*.conf can be used to include only .conf files in any subdirectory.

15.4. Editing Drift Definitions

Most entries in JBoss ON are edited by clicking their name or double-clicking their row in a list. However, for drift definitions, clicking the name or double-clicking the row opens up the list of snapshots for that definition — not the definition entry itself.
To edit a drift detection definition, click the pencil icon.

15.5. Viewing Snapshots and Changes

Note
The initial snapshot is snapshot 0. The snapshots in the carousel begin at version 1 — meaning it begins at the first change, not the initial file set.
If a snapshot is pinned, so that it is set as a baseline, then it is not displayed in the carousel because it is snapshot 0. However, it can be viewed by clicking the pinned icon in the definition list.

15.5.2. Comparing Drift Changes

Changes are diffed at the file level, not the full snapshot level. Administrators can view the specific changes made between versions on the selected files.
Note
Only changes for text files can be compared. Drift detection will identify binary files that have changed and show a timestamp and SHA, but it does not display the binary file contents or diff changes between versions of a binary file.
  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab for the resource.
  4. Click the name of the drift definition.
  5. Click the names of the files to compare.
  6. Click Compare.
The diff uses standard text formatting for displaying file diffs.

Figure 15.4. Change Set Diffs

Change Set Diffs

15.5.3. Viewing Snapshot Details

  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab for the resource.
  4. Click the name of the drift definition.
  5. In the snapshot carousel, click the magnifying glass by the name of the snapshot to view.
  6. Expand the directory to show the list of changes for that snapshot.
  7. To see the details of a specific change, click the (view) link.
  8. The details for that file shows links to display the immediate previous version of the file, the changed version of the file, and a diff between the two.
    When clicking the view link, the page title has the version number along with the file name. For example, when viewing version 6 of myfile.txt, the title is myfile.txt:6.

15.5.4. Seeing Drift Events in the Timeline

Whenever drift is detected, it shows up as an event in the events timeline for the resource.
  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. In the Summary tab, click the Timeline subtab.
  4. The detection runs where drift was detected show up in the timeline as Drift Detected. To see only drift events in the timeline, clear all but the Drift checkbox.
    The time interval can be reset to adjust the span of the timeline.

15.5.5. Checking Drift Snapshot Reports

The snapshot carousel (Section 15.5.1, “Viewing the Snapshot Carousel”) shows all of the snapshots for a single drift definition on a single resource. To view a list of all snapshots, for all definitions across all resources, check the Recent Drift Report.
  1. Click the Reports tab in the top navigation menu.
  2. Select the Recent Drift report from the Subsystems report list.
  3. Every drift instance is listed, sorted by the snapshot creation time.
  4. Optionally, filter the list of drift changes. There are four filter options:
    • The definition name
    • The snapshot number (which crosses drift definitions)
    • The change type within the snapshot, whether a file was added, deleted, or modified.
    • The path of a change within the snapshot. This path can be a directory, a specific file name, or a search expression.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
Only the information displayed for the report is exported. If the Recent Drift Report is filtered by date, definition, snapshot or version, or category, only the matching operations are included in the report.
To export a report, simply click the Export button. The report will automatically be downloaded as recentDrift.csv.

15.6. Pinning Snapshots and Managing Compliance

As discussed in Section 15.1.2, “Snapshots, Deltas, and Baseline Images”, a specific snapshot, with its complete current fileset, can be associated or pinned to a drift definition. Pinning a snapshot creates an entirely new style of drift definition. Rather than simply tracking changes, a pinned snapshot allows an administrator to establish a clear, blessed configuration for a system or application. It sets a standard with which the system configuration should comply.

15.6.1. More About Pinning Snapshots

A snapshot is a picture of the actual, current files that are on a specific resource. A snapshot is a real-world view. In normal drift conditions, each snapshot is compared to the one immediately before it to show changes. However, it is possible to select a specific snapshot as a fixed baseline to compare changes against. This is a pinned snapshot.
A drift definition sets the rules for running drift detection, but it does not add or define or overwrite any files on a resource. A drift definition does not define content or contain a file set. Content has to be added to a definition (or a definition template). A file set (a snapshot) has to be manually added to the drift definition, after the snapshot exists. This is pinning. Pinning takes a real, existing set of files from a snapshot and links it to a drift definition on a resource or a drift definition template.
Pinning is one method that administrators can use to standardize resource configuration. An administrator can use a single resource as a test box to get a resource's configuration tuned to its ideal settings. Then, that file set can be pinned to a template and re-applied to other resources of the same type. Because the pinned snapshot is based on a real resource, administrators can be confident that the configuration is realistic and functional.
Pinning a snapshot alters some fundamental behaviors with drift management in JBoss ON:
  • It removes any snapshots that were created before that snapshot. For example, if an administrator decides to pin Snapshot 7, Snapshot 0 (the initial image) through Snapshot 6 are all deleted, and Snapshot 7 becomes the new Snapshot 0.
  • It creates a baseline image that every change is compared against rather than keeping a moving tally of changes.
  • It changes the behavior of drift alerts (Section 15.8, “Defining Drift Alerts”) so that alerts are sent continually until the system configuration is back in compliance with the pinned snapshot.
  • The definition it is pinned to cannot be deleted until the snapshot is unpinned.
  • If a snapshot is pinned to a template, then all of the resource-level definitions attached to that template automatically use the pinned snapshot as their baseline.
  • Any new file added after a snapshot is pinned (or any file deleted) is going to be reported as a new file in every subsequent snapshot. This is because the new snapshot is always compared against the baseline snapshot, so the file is always new to the baseline.
    There is some logic to prevent drift from reporting the same change incessantly. If file1.txt is added, the agent creates snapshot 1. When the agent does its next detection run, it recognizes that file1.txt is not in the baseline, but as long as the SHA for file1.txt has not changed, the agent does not report it as new drift and does not take a new snapshot. If file1.txt is modified, however, the agent notices the new SHA and sends a new snapshot — with the modified file1.txt still listed as a new file, because it is compared against the baseline, not the previous version.

15.6.2. When to Pin to a Resource and When to Pin to a Template

When a snapshot perfectly matches the configuration that an administrator desires, it can be associated with a drift definition. That snapshot can be pinned to a resource-level definition or a definition template, and there are slightly different reasons to do one or the other.
  • Pinning a snapshot to a resource-level definition establishes a baseline for that resource alone. This makes sense while you are still developing an ideal baseline image or for unique environments that may not transition over to other resources.
    Pinning to a resource definition allows a lot of flexibility. It is easy to pin and unpin and select a new snapshot as the baseline, to let administrators develop an ideal configuration with a minimal impact on drift events, alerting, and monitoring because the changes are contained.
  • Pinning a snapshot to a template means that baseline can be applied to every resource that uses that template; it allows that one single snapshot to be used across multiple resources. This is makes sense for any kind of repeatable configuration areas and for production or critical systems which must have consistent configuration.
    Pinning to a template is very powerful for maintaining consistency across an entire infrastructure once an ideal configuration has been developed.
Pinning always takes a snapshot that was created on a specific resource and then promotes it to be the baseline for that definition. So the question is — why does a resource-level snapshot need to be pinned to a template? Why can't a template create and use its own snapshot?
The key is to remember that a drift definition template is associated with a resource type. The template is not defined as part of a specific resource.
For a resource-level drift definition, the very first drift detection run creates an initial snapshot based on real and existing files. That initial snapshot can be automatically applied as the baseline, pinned snapshot or any snapshot after the initial can be used as the baseline.
However, a drift definition template (Section 15.3.1, “About Resources and Drift Definition Templates”) is not associated with a resource. Therefore, templates do not have a real set of files to work with and it never has its own snapshots to use. The only way that a drift template can be associated with a snapshot is if a resource-level snapshot is pinned to the template.
In a sense, pinning a snapshot has a backward workflow from defining a drift definition. A definition starts with a template, then moves to a resource-level definition, which generates a snapshot of that resource. Pinning always begins with a snapshot on a resource, and then moves up to a definition or a definition template.
Note
A drift definition sets a very clear and limited set of criteria to use for drift detection. When a snapshot is associated with a drift definition template, the template must use the same settings as the original resource-level drift definition which generated the snapshot. If a matching template does not exist, then a new template can be created, using those criteria.

15.6.3. Pinning to a Resource-Level Definition

  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab.
  4. Click the name of the drift definition.
  5. In the snapshot carousel, click the magnifying glass by the name of the snapshot to pin.
    Note
    The initial snapshot is not displayed in the carousel. To pin the initial snapshot, click the thumbtack icon in the Pinned column of the drift definition list. That opens the initial snapshot.
    If a snapshot has already been pinned, then clicking the thumbtack icon opens the pinned snapshot.
  6. At the bottom of the change list, click the Pin to Definition button.

15.6.4. Pinning to a Template

  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab.
  4. Click the name of the drift definition.
  5. In the snapshot carousel, click the magnifying glass by the name of the snapshot to pin.
    Note
    The initial snapshot is not displayed in the carousel. To pin the initial snapshot, click the thumbtack icon in the Pinned column of the drift definition list. That opens the initial snapshot.
    If a snapshot has already been pinned, then clicking the thumbtack icon opens the pinned snapshot.
  6. At the bottom of the change list, click the Pin to Template button.
  7. If the resource-level template is based on or attached to an existing template, then you can associate the snapshot with that existing template. If the base directory for the resource-level snapshot does not match any existing drift template, then you must create a new template.

15.6.5. Checking Drift Compliance Reports

The compliance report is a variant of an inventory report. It lists all resources which currently have a drift definition configured and then shows whether they are compliant. Compliance is cumulative; if a resource has multiple drift definitions and is noncompliant on a single one, it will show as non-compliant in the report.
  1. Click the Reports tab in the top navigation menu.
  2. Select the Drift Compliance report from the Inventory report list.
  3. Every resource with a drift definition is listed by type and with an icon to indicate whether it is compliant ( ) or non-compliant ( ).
  4. To get information about the specific resources, click the resource type name; this opens a second inventory report under the main report. All of the resources of that type are listed with their compliance state.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
To export a report, simply click the Export button. The report will automatically be downloaded as driftCompliance.csv.

15.6.6. Unpinning a Snapshot

A snapshot can be unpinned — or disassociated — from either a resource-level definition or a drift template. Unpinning a snapshot moves the definition back to a rolling drift detection mode, and any resources that were out of compliance are no longer marked as non-compliant.
  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab.
  4. Click the pin icon for the drift definition.

15.7. Extended Example: Defining Required EAP Configuration

The Setup

Tim the IT Guy at Example Corp. has one EAP server running in his production environment. Because of the production load, the EAP server was routinely running out of memory, which was degrading its performance and causing downtime for Example Corp.'s website.

To resolve his immediate memory problem, all Tim has to do is change the heap size setting for his EAP instance. However, Tim needs another strategy for managing the configuration long-term. If he adds another production EAP instance or deploys a new one to replace his current one, it is going to hit the same memory-related performance problems without the new heap size setting.

What to Do

There are three things that Tim wants to accomplish to maintain his EAP performance:

  • Find a way to consistently apply configuration to EAP instances.
    He defines a template for JBoss EAP instances (Section 15.3, “Creating a Drift Definition Template”). To maintain consistency, the template sets the Attach to template value to true, and each resource-level drift definition will preserve that settings. This ensures that any changes to the template are automatically applied to the JBoss resource drift definitions.
  • Use his current production settings as a basis for future EAP instances.
    He pins his latest snapshot, with the higher heap settings, to the template definition (Section 15.6.4, “Pinning to a Template”). Every EAP instance is going to be compared against that baseline, so any with the wrong heap setting will immediately be marked out of compliance.
  • Be made aware of specific differences between his current EAP settings and his preferred settings.
    He creates an alert definition (Section 15.8, “Defining Drift Alerts”) which specifically targets the bin/run.conf file. This way, he knows precisely whether the heap settings and other JVM settings are wrong for his new instance. He can even use alerts to gather more information about how his EAP instance configuration is different, like using a CLI script to compare the current EAP configuration against the pinned snapshot and then send him the diff.

Expected Results

Tim brings a new server online, with a new EAP instance for the production environment. He applies the drift template to the new resource and, within a few minutes, receives a notification that his run.conf file is not compliant with his preferred configuration. He changes the heap settings on the new EAP instance without having to wait for performance degradation to remember the change.

15.8. Defining Drift Alerts

Drift changes have their own alert condition.
Note
Recovery alerts are not supported for drift.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the resource name in the list.
  4. Click the Alerts tab for the resource.
  5. In the Definitions subtab, click the New button to create the new alert.
  6. In the General Properties tab, give the basic information about the alert.
    It may be useful to set a Priority if the drift definition contains critical configuration files.
  7. In the Conditions tab, select the Drift Detection option from the conditions list. To use the alert for all drift changes, leave the fields blank. Otherwise, enter the specific drift definition name and (optionally) the directories or files that must be modified for the alert to be triggered.
    Note
    There can be more than one condition set to trigger an alert, meaning that you can use the same alert for multiple drift definitions or files.
  8. In the Notifications tab, click Add to set a notification for the alert.
    Select the method to use to send the alert notification in the Sender option, and fill in the required information.
    The Sender option first sets the specific type of alert method (such as email or SNMP) and then opens the appropriate form to fill in the details for that specific method.
  9. Optionally, in the Dampening tab, give the dampening (or frequency) rule on how often to send notifications for drift.
    Note
    For pinned snapshots, it can be useful to use dampening rules to keep from getting a flood of alerts before a drift problem is remedied.
    Dampening only makes sense for a definition with a pinned snapshot. A pinned definition will fire alerts with every alert scan (every 10 minutes) for as long as it is out of compliance, even if there are no further changes. A rolling definition only fires an alert once, when drift is detected.
    Any of the dampening rules can be used. The ultimate goal is to limit the number of times that the same alert is set for a resource that is out of compliance with a pinned definition. For example, Time period sets a limit on the number of times in a given time period that an alert is issued if the alert condition occurs. Setting the occurrence to 1 and the time period to 4 hours means that when drift is detected once, the server sends an alert and then waits another 4 hours before sending the next alert.
  10. Click OK to save the alert definition.

15.9. Extended Example: Reverting a JBoss Server to Its Original Configuration Using Bundles and Server Scripts

The Setup

In Section 15.7, “Extended Example: Defining Required EAP Configuration”, Tim the IT Guy at Example Corp. set up drift templates and alerts to help manage the configuration on his production EAP servers. However, his resolution was done manually. When the drift alert notified him that his EAP server was out of compliance, he edited the run.conf directly to adjust the heap size.

Manual updates are fine for small infrastructures or infrequent changes. A better management tool, though, is to automate any remediation required for drift.

What to Do

The goal is to have JBoss ON respond intelligently to drift without requiring any action from Tim the IT Guy. There are two features that allow automated responses:

  • Using bundles to provision updated files or applications. A bundle is a ZIP file that contains an Ant recipe and any required content (such as configuration files or JARs) for an application. JBoss ON can provision this content on a platform or a JBoss server in a specified directory.
  • Launching JBoss ON CLI scripts in response to an alert. One of the possible alert notifications is a server-side alert sender. A JBoss ON CLI script is loaded as content and stored in the JBoss ON server; when the alert fires, it initiates the specified, stored CLI script.

There are a few steps to remediation using bundles and CLI scripts:
  1. Create a bundle file based on the pinned snapshot configuration. The content of the bundle depends on the needs of the deployment. It can be specific configuration files, like bin/run.conf, or it can be a full EAP server.
    Note
    If the bundle contains the full EAP server, then it can be used to create the initial EAP server.
  2. Deploy the bundle with the full EAP server to create the new EAP instance. (Or, if the bundle only has configuration files, create the EAP instances.)
  3. Set up the drift definitions, based on the previously configured template (Section 15.7, “Extended Example: Defining Required EAP Configuration”), for the new EAP instance.
  4. Create a JBoss ON CLI script (in JavaScript) that will automatically deploy the specified bundle to the appropriate destination. An example is in Example 15.2, “fix-eap.js Script”; in that script, replace the destinationId and bundleVersionId with the real ID numbers for the destination entry and bundle version entry in JBoss ON.
  5. Create an alert definition that triggers on the drift detection condition and uses the CLI script notification type, pointing to the JavaScript file that you created.

Expected Results

Any time drift is detected on the EAP server, it triggers an alert, same as in Section 15.7, “Extended Example: Defining Required EAP Configuration”. This time, the alert launches the CLI script in response and automatically deploys the bundle — which already has the approved EAP configuration — to the resource. This means that the EAP server is never more than a few minutes out of compliance, roughly the length of one alert scan. All without requiring intervention from Tim the IT Guy.

Example 15.2. fix-eap.js Script

/**
 * If obj is a JS array or a java.util.Collection, each element is passed to
 * the callback function. If obj is a java.util.Map, each map entry is passed
 * to the callback function as a key/value pair. If obj is none of the
 * aforementioned types, it is treated as a generic object and each of its
 * properties is passed to the callback function as a name/value pair.
 */
function foreach(obj, fn) {
  if (obj instanceof Array) {
    for (i in obj) {
      fn(obj[i]);
    }
  }
  else if (obj instanceof java.util.Collection) {
    var iterator = obj.iterator();
    while (iterator.hasNext()) {
      fn(iterator.next());
    }
  }
  else if (obj instanceof java.util.Map) {
    var iterator = obj.entrySet().iterator()
    while (iterator.hasNext()) {
      var entry = iterator.next();
      fn(entry.key, entry.value);
    }
  }
  else {   // assume we have a generic object
    for (i in obj) {
      fn(i, obj[i]);
    }
  }
}

/**
 * Iterates over obj similar to foreach. fn should be a predicate that evaluates
 * to true or false. The first match that is found is returned.
 */
function find(obj, fn) {
  if (obj instanceof Array) {
    for (i in obj) {
      if (fn(obj[i])) {
        return obj[i]
      }
    }
  }
  else if (obj instanceof java.util.Collection) {
    var iterator = obj.iterator();
    while (iterator.hasNext()) {
      var next = iterator.next();
      if (fn(next)) {
        return next;
      }
    }
  }
  else if (obj instanceof java.util.Map) {
    var iterator = obj.entrySet().iterator();
    while (iterator.hasNext()) {
      var entry = iterator.next();
      if (fn(entry.key, entry.value)) {
        return {key: entry.key, value: entry.value};
      }
    }
  }
  else {
    for (i in obj) {
      if (fn(i, obj[i])) {
        return {key: i, value: obj[i]};
      }
    }
  }
  return null;
}

/**
 * Iterates over obj similar to foreach. fn should be a predicate that evaluates
 * to true or false. All of the matches are returned in a java.util.List.
 */
function findAll(obj, fn) {
  var matches = java.util.ArrayList();
  if ((obj instanceof Array) || (obj instanceof java.util.Collection)) {
    foreach(obj, function(element) {
      if (fn(element)) {
        matches.add(element);
      }
    });
  }
  else {
    foreach(obj, function(key, value) {
      if (fn(theKey, theValue)) {
        matches.add({key: theKey, value: theValue});
      }
    });
  }
  return matches;
}

/**
 * A convenience function to convert javascript hashes into RHQ's configuration
 * objects.
 * <p>
 * The conversion of individual keys in the hash follows these rules:
 * <ol>
 * <li> if a value of a key is a javascript array, it is interpreted as PropertyList
 * <li> if a value is a hash, it is interpreted as a PropertyMap
 * <li> otherwise it is interpreted as a PropertySimple
 * <li> a null or undefined value is ignored
 * </ol>
 * <p>
 * Note that the conversion isn't perfect, because the hash does not contain enough
 * information to restore the names of the list members.
 * <p>
 * Example: <br/>
 * <pre><code>
 * {
 *   simple : "value",
 *   list : [ "value1", "value2"],
 *   listOfMaps : [ { k1 : "value", k2 : "value" }, { k1 : "value2", k2 : "value2" } ]
 * }
 * </code></pre>
 * gets converted to a configuration object:
 * Configuration:
 * <ul>
 * <li> PropertySimple(name = "simple", value = "value")
 * <li> PropertyList(name = "list")
 *      <ol>
 *      <li>PropertySimple(name = "list", value = "value1")
 *      <li>PropertySimple(name = "list", value = "value2")
 *      </ol>
 * <li> PropertyList(name = "listOfMaps")
 *      <ol>
 *      <li> PropertyMap(name = "listOfMaps")
 *           <ul>
 *           <li>PropertySimple(name = "k1", value = "value")
 *           <li>PropertySimple(name = "k2", value = "value")
 *           </ul>
 *      <li> PropertyMap(name = "listOfMaps")
 *           <ul>
 *           <li>PropertySimple(name = "k1", value = "value2")
 *           <li>PropertySimple(name = "k2", value = "value2")
 *           </ul>
 *      </ol>
 * </ul>
 * Notice that the members of the list have the same name as the list itself
 * which generally is not the case.
 */
function asConfiguration(hash) {

 config = new Configuration;

 for(key in hash) {
  value = hash[key];

  if (value == null) {
   continue;
  }

  (function(parent, key, value) {
   function isArray(obj) {
    return typeof(obj) == 'object' && (obj instanceof Array);
   }

   function isHash(obj) {
    return typeof(obj) == 'object' && !(obj instanceof Array);
   }

   function isPrimitive(obj) {
    return typeof(obj) != 'object';
   }

   //this is an anonymous function, so the only way it can call itself
   //is by getting its reference via argument.callee. Let's just assign
   //a shorter name for it.
   var me = arguments.callee;

   var prop = null;

   if (isPrimitive(value)) {
    prop = new PropertySimple(key, new java.lang.String(value));
   } else if (isArray(value)) {
    prop = new PropertyList(key);
    for(var i = 0; i < value.length; ++i) {
     var v = value[i];
     if (v != null) {
      me(prop, key, v);
     }
    }
   } else if (isHash(value)) {
    prop = new PropertyMap(key);
    for(var i in value) {
     var v = value[i];
     if (value != null) {
      me(prop, i, v);
     }
    }
   }

   if (parent instanceof PropertyList) {
    parent.add(prop);
   } else {
    parent.put(prop);
   }
  })(config, key, value);
 }

 return config;
}

/**
 * Opposite of <code>asConfiguration</code>. Converts an RHQ's configuration object
 * into a javascript hash.
 *
 * @param configuration
 */
function asHash(configuration) {
 ret = {}

 iterator = configuration.getMap().values().iterator();
 while(iterator.hasNext()) {
  prop = iterator.next();

  (function(parent, prop) {
   function isArray(obj) {
    return typeof(obj) == 'object' && (obj instanceof Array);
   }

   function isHash(obj) {
    return typeof(obj) == 'object' && !(obj instanceof Array);
   }

   var me = arguments.callee;

   var representation = null;

   if (prop instanceof PropertySimple) {
    representation = prop.stringValue;
   } else if (prop instanceof PropertyList) {
    representation = [];

    for(var i = 0; i < prop.list.size(); ++i) {
     var child = prop.list.get(i);
     me(representation, child);
    }
   } else if (prop instanceof PropertyMap) {
    representation = {};

    var childIterator = prop.getMap().values().iterator();
    while(childIterator.hasNext()) {
     var child = childIterator.next();

     me(representation, child);
    }
   }

   if (isArray(parent)) {
    parent.push(representation);
   } else if (isHash(parent)) {
    parent[prop.name] = representation;
   }
  })(ret, prop);
 }
 (function(parent) {

 })(configuration);

 return ret;
}

/**
 * A simple function to create a new bundle version from a zip file containing
 * the bundle.
 * 
 * @param pathToBundleZipFile the path to the bundle on the local file system
 * 
 * @return an instance of BundleVersion class describing what's been created on 
 * the RHQ server.
 */
function createBundleVersion(pathToBundleZipFile) {
 var bytes = getFileBytes(pathToBundleZipFile)
 return BundleManager.createBundleVersionViaByteArray(bytes)
}

/**
 * This is a helper function that one can use to find out what base directories
 * given resource type defines.
 * <p>
 * These base directories then can be used when specifying bundle destinations.
 * 
 * @param resourceTypeId
 * @returns a java.util.Set of ResourceTypeBundleConfiguration objects
 */
function getAllBaseDirectories(resourceTypeId) {
 var crit = new ResourceTypeCriteria;
 crit.addFilterId(resourceTypeId);
 crit.fetchBundleConfiguration(true);
 
 var types = ResourceTypeManager.findResourceTypesByCriteria(crit);
 
 if (types.size() == 0) {
  throw "Could not find a resource type with id " + resourceTypeId;
 } else if (types.size() > 1) {
  throw "More than one resource type found with id " + resourceTypeId + "! How did that happen!";
 }
 
 var type = types.get(0);
 
 return type.getResourceTypeBundleConfiguration().getBundleDestinationBaseDirectories();
}

/**
 * Creates a new destination for given bundle. Once a destination exists,
 * actual bundle versions can be deployed to it.
 * <p>
 * Note that this only differs from the <code>BundleManager.createBundleDestination</code>
 * method in the fact that one can provide bundle and resource group names instead of their
 * ids.
 * 
 * @param destinationName the name of the destination to be created
 * @param description the description for the destination
 * @param bundleName the name of the bundle to create the destination for
 * @param groupName name of a group of resources that the destination will handle
 * @param baseDirName the name of the basedir definition that represents where inside the 
 *                    deployment of the individual resources the bundle will get deployed
 * @param deployDir the specific sub directory of the base dir where the bundles will get deployed
 * 
 * @return BundleDestination object
 */
function createBundleDestination(destinationName, description, bundleName, groupName, baseDirName, deployDir) {
 var groupCrit = new ResourceGroupCriteria;
 groupCrit.addFilterName(groupName);
 var groups = ResourceGroupManager.findResourceGroupsByCriteria(groupCrit);
 
 if (groups.empty) {
  throw "No group called '" + groupName + "' found.";
 }
 
 var group = groups.get(0);
 
 var bundleCrit = new BundleCriteria;
 bundleCrit.addFilterName(bundleName);
 var bundles = BundleManager.findBundlesByCriteria(bundleCrit);
 
 if (bundles.empty) {
  throw "No bundle called '" + bundleName + "' found.";
 }
 
 var bundle = bundles.get(0);
 
 return BundleManager.createBundleDestination(bundle.id, destinationName, description, baseDirName, deployDir, group.id);
}

/**
 * Tries to deploy given bundle version to provided destination using given configuration.
 * <p>
 * This method blocks while waiting for the deployment to complete or fail.
 * 
 * @param destination the bundle destination (or id thereof)
 * @param bundleVersion the bundle version to deploy (or id thereof)
 * @param deploymentConfiguration the deployment configuration. This can be an ordinary
 * javascript object (hash) or an instance of RHQ's Configuration. If it is the former,
 * it is converted to a Configuration instance using the <code>asConfiguration</code>
 * function from <code>util.js</code>. Please consult the documentation of that method
 * to understand the limitations of that approach.
 * @param description the deployment description
 * @param isCleanDeployment if true, perform a wipe of the deploy directory prior to the deployment; if false,
 * perform as an upgrade to the existing deployment, if any
 * 
 * @return the BundleDeployment instance describing the deployment
 */
function deployBundle(destination, bundleVersion, deploymentConfiguration, description, isCleanDeployment) {
 var destinationId = destination;
 if (typeof(destination) == 'object') {
  destinationId = destination.id;
 }
 
 var bundleVersionId = bundleVersion;
 if (typeof(bundleVersion) == 'object') {
  bundleVersionId = bundleVersion.id;
 }
 
 var deploymentConfig = deploymentConfiguration;
 if (!(deploymentConfiguration instanceof Configuration)) {
  deploymentConfig = asConfiguration(deploymentConfiguration);
 }
 
 var deployment = BundleManager.createBundleDeployment(bundleVersionId, destinationId, description, deploymentConfig);
 
 deployment = BundleManager.scheduleBundleDeployment(deployment.id, isCleanDeployment);
 
 var crit = new BundleDeploymentCriteria;
 crit.addFilterId(deployment.id);
 
 while (deployment.status == BundleDeploymentStatus.PENDING || deployment.status == BundleDeploymentStatus.IN_PROGRESS) {
  java.lang.Thread.currentThread().sleep(1000);
  var dps = BundleManager.findBundleDeploymentsByCriteria(crit);
  if (dps.empty) {
   throw "The deployment disappeared while we were waiting for it to complete.";
  }
  
  deployment = dps.get(0);
 }
 
 return deployment;
}


var destinationId = 10002;
var bundleVersionId = 10002;
var deploymentConfig = null;
var description = "redeploy due to drift";
// NOTE: It's essential that isCleanDeployment=true, otherwise files that have drifted will not be replaced with their
//       original versions from the bundle.
var isCleanDeployment = true;
deployBundle(10002, 10002, deploymentConfig, description, true);

15.10. Running Drift Detection Manually

The drift detection scan runs periodically, according to the interval set in the definition. (The default is 1800 seconds, or 30 minutes.) There can be times when you know that files in the directory have changed and you need a snapshot to be created immediately, but you do not want to change the interval permanently. Simply run a detection scan manually.
  1. Click the Inventory tab in the top menu.
  2. Search for the resource.
  3. Click the Drift tab.
  4. Select the drift definition to run the scan for.
  5. Click the Detect Now button.

15.11. Setting Planned Changes or Disabling Drift Definitions

The assumption behind drift monitoring is that there is an identified and specific configuration for a platform or application and that that configuration should be preserved. Changes, therefore, are undesirable and need to be monitored.
However, there can be times when changes are expected, such as scheduled maintenance and upgrade periods. In that situation, it's beneficial to suspend drift monitoring to keep from creating unnecessary static with drift alerts.
There are two ways to suspend drift monitoring:
  • Set the drift handling mode to planned changes. This keeps running drift detection scans and records changes. Since the changes are expected, though, it doesn't trigger a drift detection event, so it does not issue a drift alert.
  • Actually disable the drift definition. This suspends the drift detection runs for the definition, not just drift events.
The drift handling mode and the enable option for the drift definition can be edited in the definition entry, as in Section 15.4, “Editing Drift Definitions”.

Figure 15.5. Drift Handling Mode and Enable Options

Drift Handling Mode and Enable Options

15.12. Changing How Long Drift Snapshots Are Stored

Drift snapshots are stored within the JBoss ON database for a limited period of time (31 days). This allows enough time to remediate any unauthorized changes, but maintains some resource limits on how much data is stored.
Any unused snapshots are removed once the time limit is reached. Unused snapshots are snapshots which are not pinned or which are associated with a disabled or deleted drift definition (orphaned).
Baseline snaphots (snapshot 0) and pinned snapshots are always saved.
  1. In the Configuration menu, select the System Settings item.
  2. Scroll to the Data Manager Configuration Properties section.
  3. Change the storage times for the drift snapshots. Unused snapshots are not pinned or a baseline, while orphaned snapshots are related to disabled definitions.

15.13. Understanding Drift and JBoss ON Agents and Servers

15.13.1. Drift Inventory

Both the JBoss ON agent and the JBoss ON server maintain their own inventories of the resources, directories, and files monitored for drift. When the agent starts up, it compares its inventory with the server inventory.
The drift information is stored, with the other agent data, in the agentRoot/rhq-agent/data/ directory. The information in this directory is deleted if the agent is started with new configuration (--cleanconfig) or it can be intentionally purged (--purgedata). If the drift information is lost, then the agent requests the last snapshot from the JBoss ON server.
The agent always sends the latest changeset to the server as a snapshot. If the server is offline for some period and misses updates, then the agent sends the most current snapshot, which effectively rolls all changes into one snapshot, even if the changes accumulated over several drift detection runs.

15.13.2. The Drift Server Plug-in

The server processes drift changes through a server-side plug-in. This plug-in must be enabled for the server to recognize and process drift data sent from the agent.
As with other server-side plug-ins, the drift plug-in can be disabled. However, this effectively and entirely disables drift monitoring on that server, and no drift information is processed or stored. That is slightly different than the behavior of other server subsystems. For example, an individual alert sender can be disabled, but alert detections are still run and alert information is still processed, stored, and displayed by the JBoss ON server.
Warning
If the drift server-side plug-in is disabled, then the server ignores any incoming drift reports. Even if the drift server-side plug-in is re-enabled, any information sent while the plug-in was disabled is lost.


[3] JBoss ON detects that changes have been made to a binary file. It does not display binary files or compare or diff changes between versions for binary files, only text files.

Part III. Monitoring

Chapter 16. Introduction: Monitoring and Responding to Resource Activity

One of the core functions of JBoss Operations Network is that it lets administrators stay aware of the state of their JBoss servers, platforms, and overall IT environment.
The current state of individual servers and applications provides critical information to IT staff about traffic and usage, equipment failures, and server performance. JBoss Operations Network can supply a clearer picture of these critical data by automatically monitoring resources in its inventory.
The most powerful aspect of management is the ability to know, accurately, where your resources are and to respond to that ever-changing situation reliably.

16.1. Monitoring and Types of Data

Monitoring gives insight into how a specific machine, application, or service is performing. JBoss ON collects different types of information from different native and external sources for its managed resources.
JBoss Operations Network is not a real-time monitor or an archive of data points or a profiler. What JBoss ON does is, in essence, filter and process raw data so that long-term trends, operating parameters, and performance histories — the purpose of monitoring — are clear and accessible from the data. JBoss ON uses schedules to define what information to gather and how frequently (anywhere from 30 seconds to hours). This prioritizes the performance information for a resource and makes important information more visible and coherent.
Although the precise information gathered is different depending on the resource type, there are a few broad categories of monitoring data. Each category obtains information from a different place and is useful to determine a different aspect of resource behavior.
Availability or "up and down" monitoring
This is both basic and critical. Availability is status information about the resource, whether it is running or stopped.
Numeric metrics
Metrics are the core performance data for a resource. Almost every software product exposes some sort of information about itself, some measurable facet that can be checked. This is usually This numeric information is collected by JBoss ON, on defined schedules.
Metric information is processed by the server. There are three states of the monitoring data used:
  • Raw data, which are the readings collected on schedule by the agent and sent to the server
  • Aggregated data, which is compressed data processed by the server into 1-hour, 6-hour, and 24-hour averages and used to calculate baselines and normal operating ranges for resources. These aggregated data are the information displayed in the monitoring graphs and returned in the CLI as metrics.
  • Live values, which are ad hoc requests for the current value of a metric.
    Metric values are rolling live-streams of the resource state; they are essentially snapshots that the agent takes of the readings on predefined schedules. Those data are then aggregated into means and averages to use to track resource performance.
    Live values are immediate, aggregated, current readings of a metric value.
Metric information is especially important because it is collected and stored long-term. This allows for historical views on resource performance, as well as recent views.
Logfile messages (events)
While JBoss ON is not a log viewer, it can monitor specified logs and check for important log messages based on severity or strings within the log messages. This is event monitoring, and it allows JBoss ON to identify incidents for a resource and to send an alert notification and, if necessary, take corrective action based on dynamic information outside normal metrics.
Response time metrics
Certain types of resources (URLs for web servers or session beans) depend on responsiveness as a component of overall performance. Response time or call-time data tracks how quickly the URL or session bean responds to client requests and helps determine that the overall application is performant.
Descriptive strings (traits)
Most resources have some relatively static information that describe the resource itself, such as an instance name, build date, or version number. This information is a trait. As with other attributes for a resource, this can be monitored. Traits are useful to identify changes to the underlying application, like a version update.

16.2. Alerts and Responses to Changing Conditions

A critical part of monitoring is being aware of when undesirable events occur. Alerting works with other functions in JBoss ON management (monitoring data and configuration drift detection) to define conditions for triggering an alert.
When an alert condition is met, alerting in JBoss ON serves two important functions:
  • Alerts communicate that there has been a problem, based on parameters defined by an administrator.
  • Alerts respond to incidents automatically. Administrators can automatically initiate an operation, run a JBoss ON CLI script to change JBoss ON or resource configuration, redeploy content, or run a shell script, all in response to an alert condition.
    Automatic, administrator-defined responses to alerts make it significantly easier for administrators to address infrastructure problems quickly, and can mitigate the effect of outages.
Alerts are based on metrics information, call-time data, availability, and events, all normal monitoring elements. Alerting can also be based on critical changes to a resource, defined in drift definitions that track configuration drift. Tracking configuration for resources along with monitoring data lets administrators remedy unplanned or undesirable system changes easily and consistently.

16.3. Potential Impact on Server Performance

Theoretically, there is no limit to the number of metrics that can collected or the number of alerts that can be fired.
In reality, there are natural constraints within the IT environment that limit both monitoring and alert settings:
  • Database performance, which is the primary factor in most environments
  • Network bandwidth
There are no hard limits on JBoss ON's alerting and monitoring configuration since it depends on the number of resources, number of metrics, collection frequency, and the number of alerts.
As a rule of thumb, there are these performance thresholds:
  • Up to 30,000 metrics can be collected per minute
  • Up to 100,000 alerts can be fired per day (roughly 70 per minute)
Plan how to implement metrics collection and alerting. Prioritize resources and then the information required from those resources when enabling metrics schedules and setting collection frequencies. Then, based on those priorities, plan what alerts are required.
Clear monitoring and alerting strategies can help maintain performance while still gathering critical information.
Note
Additional storage nodes can be added to the storage cluster to extend the amount of metrics collected.
The storage nodes can be permanent, as part of the JBoss ON design, or can be created dynamically, using an alert condition on the existing nodes to trigger a new node deployment.
Creating additional nodes is covered in Section 24.3, “Deploying and Managing Storage Nodes”.

16.4. Differences with Monitoring Based on Different Resource Types

Available metrics, events, traits, and other monitoring settings are defined for each resource type in its plug-in descriptor.
Obviously, software of completely different types have different possible monitoring configuration.
However, monitoring settings can be different between releases of the same software. Either different metrics are available or the same metric may have different configuration names. For example, JBoss EAP 4 and 5 have the same metrics, related to monitoring the EAP server JVM, threads, and transactions. Because of the different management structure in JBoss EAP 6, there are different metrics, related to management requests between the servers in the EAP 6 domain.
The Resource Reference: Monitoring, Operation, and Configuration Options has a complete references of available metrics for the official JBoss ON agent plug-ins. Check this guide to see what differences there are between release versions.

Chapter 17. Monitoring Reports and Data

Monitoring information in JBoss ON is easy to find. Resources and groups both have dashboards which contain snapshot views of the most recent metrics values and a series of graphs and tables which break down the different metrics values over a given time window.
Monitoring information is available in several areas:
  • Dashboards with metrics portlets for individual resources, compatible groups, and the main dashboard
  • Timelines, which aggregate all collected data, events, configuration, operations, and other changes for a resource
  • Resource-level charts and tables for metrics
  • A suspect metrics report for outlier or out-of-bounds metrics

17.1. Dashboards and Portlets

The fastest place to view monitoring and alerting information is through one of the JBoss ON dashboards. The dashboards collect almost all monitoring, event, alert, and operations data into a single location.
Each data set is collected in a separate box or portlets displayed in the dashboard. These portlets can be edited, added, and removed from the dashboards; this is covered in the Admin: Initial Setup for the Resource Inventory, Groups, and Users.

17.1.1. Resource-Level Dashboards

The Summary > Activity tab for an individual resource (or compatible group) shows a snapshot of all recent actions on the resource, such as new packages and content, inventory changes, events, operations, and alerts. There is also a portlet that displays the most recent detected value of the primary metrics for the resource.

Figure 17.1. Resource Summary Tab

Resource Summary Tab
Click on any metric name in the Resource: Measurements portlet opens the metric graph. Clicking the see more... link opens the metrics charts in the Monitoring tab.

17.1.2. Main Dashboard

The Dashboard main page has a global view of all resources in the inventory. By default, this page shows only alerting data and unavailable resources. However, the Dashboard can be customized to show different portlets of monitoring data. Additionally, the main page can have multiple dashboards, so a dashboard can be created to look at different metrics for the same resource, the same metrics for different resources, or a combination of relevant metrics for a group of related resources — whatever you design.
The main dashboard has several types of portlets specifically for monitoring data:
  • Platform Utilization, which shows free memory, CPU usage, and other metrics related to platform performance.
  • Alerted or Unavailable Resources, which shows a list of the most recent five resources which have issued an alert or been reported as down.
  • Recent Alerts and Events that can be filtered by Date/Time, Name and Priority.
  • A graph for a specific metric for a compatible group.
  • A graph for a specific metric for a resource.

Figure 17.2. Dashboard Portlets with MOnitoring Data

Dashboard Portlets with MOnitoring Data

17.1.3. Adding Monitoring Metrics to the Main Dashboard

Charts for a specific metric for a resource can be added to the Dashboard. This makes it easier to see the current state of important readings for common or critical resources immediately, without having to configure alerts or check resource entries.

Procedure 17.1. To add monitoring metrics to the dashboard

  1. Click the Inventory tab in the top menu, and navigate to the Resource.
  2. In the Monitoring tab, select the Metrics sub-tab.
  3. Select the metric from the list, and then click Add to Dashboard
A chart for that specific metric on that specific resource is automatically added to the Dashboard that was selected.

17.2. Summary Timelines

The Timeline subtab in the Summary tab shows a line chart of all of the activity for the resource (with the exception of metrics collection, which is all under the Monitoring tab and charts). The Timeline aggregates all configuration changes, inventory changes, drift, events, content and bundle changes, operations, and alerts. Clicking any given point opens up the details for that specific action.

Figure 17.3. Summary Timeline

Summary Timeline
Because all information is on a single timeline, it becomes must easier to correlate incidents and events and to get a better understanding of the overall activity on that resource.

17.3. Resource-Level Metrics Charts

The Monitoring tab for a resource (or for a compatible group) has a series of different subtabs, each marking a different type of monitoring data. Each data group has its own monitoring charts.
The Metrics subtab initially displays a table, laid out with the maximum, minimum, and average values for the baseline period for the metric, the most recent reading, and a trendline. Expanding any given metric shows a graph with the same information, displayed visually.

Figure 17.4. Metrics Chart

Metrics Chart
Hovering over any given data point shows the value of that aggregated point (which corresponds to the time period for the size of the graph: the average value for one-sixtieth, 1/60, of the total graphed period; for example, for a 60-hour graph, a data point is one hour).

Figure 17.5. Hovering over a Data Point

Hovering over a Data Point
There are buttons at the top of the Metrics tab to change the date range for the displayed data. Additionally, it is possible to drag over a portion of the graph and generate a new graph, based on the free selected range.

Figure 17.6. Selecting a Subset of the Graph

Selecting a Subset of the Graph

17.4. Suspect Metrics Report

As described in Section 19.1.4, “Baselines and Out-of-Bounds Metrics”, once metrics have been collected a few times, JBoss ON begins calculating a normal operating range for that specific resource and that specific metric. This creates a range based on the lowest and highest values.
If a metric data point comes in that is outside that normal range, higher or lower, that is a suspect metric. It could be a fault of the metric collection or it could indicate a resource problem.
Each individual resource has a portlet on its Summary tab which lists suspect, or out-of-bounds, metrics.

Figure 17.7. Out of Bounds Portlet

Out of Bounds Portlet
All resources, across the inventory, which have a suspect metric are listed in the Suspect Metrics report with the metric, its normal range, its suspect reading, and the factor or percentage of how far outside the metric is from normal readings.

Figure 17.8. Suspect Metrics Reports

Suspect Metrics Reports
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
To export a report, simply click the Export button. The report will automatically be downloaded as suspectMetrics.csv.

17.5. Platform Utilization Report

For general infrastructure monitoring, the primary resource is the platform. The Platform Utilization report shows a very quick snapshot on the health of every platform in the inventory by showing its current system performance, in three metrics:
  • Current CPU percentage
  • The actual memory usage, based on the available physical memory, buffer, and cache
  • Swap

Figure 17.9. Platform Utilization Report

Platform Utilization Report
Note
This report can also be added to the main Dashboard or a resource-level Summary dashboard as a portlet.
There are a couple of caveats. Only available platforms are listed. Other platforms in the inventory that are not in an available state are not listed. Also, the utilization is based on the most recent live data, not averages or historical values. It provides an immediate look at the platform resources.
Note
Reports can be exported to CSV, which can be used for office systems or further data manipulation.
To export a report, simply click the Export button. The report will automatically be downloaded as platformUtilization.csv.

Chapter 18. Availability

One of the most basic elements for monitoring is knowing whether your server or application is running. Availability monitoring tells administrators that a certain process is running and minimally responsive.

18.1. Core "Up and Down" Monitoring

The first question with monitoring is is the resource running? A resource's availability is the first thing to check for overall performance, for determining service levels, and for maintaining infrastructure.
Availability (sometimes called up or down monitoring) determines whether a resource is up or whether it is in some other state.
Up means that the resource is running and that it responds to the agent within a prescribed time.
How availability is determined depends on the resource; it could be checking a process ID or a JVM or something else. Availability for a resource type is defined in its plug-in descriptor. Therefore, the plug-in container is the intermediary between the resource and the agent. The agent checks the plug-in container for resource availability; the container obtains it from the resource component.
Usually, an availability check takes a fraction of a second; for certain types of resources or in certain environments, it could take longer. There is a timeout period for availability scans, set to five (5) seconds by default. If a resource is running and responds to the availability scan within that five-second window, the resource is up.
Because availability — or "up and down" — monitoring is so critical to IT administrators, availability states in JBoss ON are highly visible. Availability is displayed on resource details pages, in every list of resources, in groups, and in monitoring reports. The idea is that it should only take a glance to be able to determine whether your resource is up.

Figure 18.1. Resource Availability

Resource Availability
Even though availability is not a true monitoring metric, the Monitoring > Metrics page even shows the percentage of time, within the display time period, that the resource has been in an up state. This is because availability (and concomitant uptime) impacts every other metric collected by the agent.

Figure 18.2. Availability Uptime Percentage

Availability Uptime Percentage
Note
Often, if a resource shows down availability even when it is running, it is a problem with the connection settings. The agent may not have information it requires, such as a username or new port number, that it requires to connect to the resource. Since the agent cannot connect to the resource, it assumes it is down.

18.1.1. Long Scan Times and Async Availability Collection

Availability scans are performed by a resource plug-in itself, for its defined resource types, and then reported to the plug-in container.
Availability checks are typically very fast, fractions of a second, but there can be situations where an availability check takes longer. The plug-in container limits how long an availability check can run to five seconds, to prevent a rogue plug-in from delaying availability reporting for all other resources managed by the agent.
There can be instances where a certain plug-in or resource type consistently has scans longer than the five-second timeout period.
For custom plug-ins, plug-in writers can configure asynchronous availability checking. Basically, with async availability checks, the resource component creates its own, independent thread to run availability checks. Within that thread, the availability checks can take as long as they need to complete. The availability checks can also be run fairly frequently, every minute by default, to make sure that the availability state is current, even if the full check takes longer to complete.
The component caches and then reports the most recent availability result to the plug-in container. That stored last availability can be delivered very quickly, in the fractions of a second that the plug-in container expects.
Async availability checks are implemented through the AvailabilityCollectorRunnable class in the JBoss ON plug-in API. Details for this class are available in the Plug-in API and Writing Custom Plug-ins Guides.
Note
It is also possible to address long availability check times by extending the scan timeout period in the agent configuration itself. For example, add a new timeout period to the ADDITIONAL_JAVA_OPTIONS parameters in the rhq-agent-env.sh file:
RHQ_AGENT_ADDITIONAL_JAVA_OPTS="$RHQ_AGENT_ADDITIONAL_JAVA_OPTS -Drhq.agent.plugins.availability-scan.timeout=15000"
However, that timeout period applies to the entire plug-in container, not just one specific, slow-running plug-in. If there are several plug-ins that are running sluggish availability checks, then the availability report may take too long to complete, causing the agent to delay or even miss sending availability reports to the JBoss ON server.
Generally, it is preferable to configure async availability on a custom plug-in, rather than trying to reset the scan interval for all plug-ins.

18.1.2. Synchronous Availability

Availability scans are run on defined schedules, anywhere from every minute to every 20 minutes by default. This means that most availability data is asynchronous — it is displayed in the availability timeline, in reports, and in most of the UI based on the most recent (but not necessarily current) value.
There is one way to get synchronous, near-real time availability information: by viewing the resource's Monitoring tab. As long as the Monitoring tab is open, the availability reading is checked every 15 seconds, rather than their configured collection schedule. This is as close as possible to real-time availability information.
Note
This altered schedule for collecting availability could impact dampening rules for any alerts for a resource.
For example, if availability is scheduled to be checked every 10 minutes with dampening set to fire an alert after three (3) occurrances of a certain state, then an alert could fire after less than a minute if a certain state is read — even though the intent of the alert is to fire only after half an hour of the condition persisting.

18.1.3. Availability States

There is a gray area between up and not up. While a resource may not be up, it may be not up for different reasons. For instance, an agent could have been restarted, so no resource states are known. Or a resource may have been taken offline for maintenance, so no availability reports are being sent.
The different resource states are listed in Table 18.1, “Availability States”.

Table 18.1. Availability States

State Description Icon
Available (UP) The resource is running and responding to availability status checks.
Down The resource is not responding to availability checks.
Unknown The agent does not have a record of the resource's state. This could be because the resource has been newly added to the inventory and has not had its first availability check or because the agent is down.
Disabled The resource has been administratively marked as unavailable. The resource (in reality) could be running or stopped. Disabling a resource means that the server ignores the availability reports from the agent to prevent unnecessary alerts based on a (known) down or cycling state.
Mixed (For groups only.) [a]
The resources in a group have different availability states.
[a] A similar warning sign can be displayed next to the resource availability at the top of the resource details page. That warning indicates that an error message or suspect metric has been returned for that resource, not that the resource's availability is in a warning state.

18.1.4. Parent-Child States and Backfilling

Availability is assessed from the top of the resource tree downward. For example, if an application server is down, it is safe to assume that all of its dependent webapp children are also down.
This is called backfilling. The parent's state is propagated to its children without running additional availability scans for each child. Backfilling can set children to down, unknown, or disabled states.
In some cases, backfilling even includes up states. Some dependent child resources (low priority services that only run if the parent is running) may not even have their own availability assessed independently by default. When a child's availability checking is disabled, the child presumptively uses its parent's state. If the parent is up, those children are assumed to be up.
There is one slight variation on backfilling — if a platform is marked as down. A platform being down is the same as the agent being down. It means that the agent has not reported to the server. There could be a number of reasons for that, apart from any servers or services actually being offline. In this case, the platform (functionally, the agent) is set to down, but its children are set to unknown.

18.1.5. Collection Intervals and Agent Scan Periods

As alluded to, an availability reading is not the same as a metric collection. There are some superficial similarities, mainly in that they both are collected on schedules and that they both relate to resource performance.
Internally, availability and metrics are treated differently. Availability is called through different functions and reported separately, and, more important, availability reports are prioritized higher than other reports sent by the agent, including monitoring reports.
While availability reports are sent as first priority messages, resources themselves have different priorities for availability scans. Higher priority (more critical) resources are, by default, checked for availability more frequently:
  • An agent heartbeat ping (analogous to the platform's availability) is sent to the server every minute.
  • Server availability is checked every minute.
  • Service availability is checked every 10 minutes.
The agent itself runs an availability scan at 30-second intervals. Not every resource is checked with every scan. When the agent scan runs, only those resources scheduled to be checked are checked. So, there are functionally two availability schedules working together in tandem, the agent scan interval and the resource collection schedule. For example, if a server is configured with a 60-second interval for availability checks and the agent scan period is 30 seconds, the server is eligible to be checked every two scans. That means that the server is checked roughly every 60 seconds, but that is a best effort estimate; if the agent is under a heavy load or if there are a large number of resources, the agent may run its scans longer than every 30 seconds, so the actual interval between checks for a specific resource would be longer.
The agent only sends an availability report to the server if there is an availability state change for one of its managed resources.
If an agent goes down suddenly, it shows a down state within five minutes, the (default) agent quiet period. If the agent shuts down gracefully, the JBoss ON server recognizes the state change within about a minute. Once the server recognizes the agent is down, it begins backfilling the states of all of the resources in that agent's inventory (Section 18.1.4, “Parent-Child States and Backfilling”).
Down servers typically record a down state between one and two minutes after going down. This is not exactly real-time, but it is close enough for most infrastructure to be able to establish a reliable baseline of performance and even calculate service levels and uptime. A short window of 90 seconds can catch most resource cycling.
The default agent scan interval is 30 seconds, but, depending on a resource schedule, it could be over 10 minutes before some services are detected as down. If an administrator suspects that there has been a state change, it is possible to force an immediate availability scan for all resources for the agent through the interactive agent prompt:
> avail -- force
Using simply the avail command runs the check for the next scheduled resources, not all resources.
Additionally, resource plug-ins can be written so that any operation which could cause a state change (such as start, stop, and restart operations) automatically requests an availability check for the resource when the operation ends.

18.2. Viewing a Resource's Availability Charts

  1. Click the Inventory tab in the top menu.
  2. Select the resource category, such as servers or services, in the Resources menu table on the left. Then browse or search for the resource.
  3. Click the name of the resource in the list.
  4. Open the resource's Monitoring tab.
The Availability chart for a resource shows when, and for how long, a resource changes states. This includes timestamps of whenever the availability changes and total counts of how much time the resource spends in the up and down states.

Figure 18.3. Availability Charts

Availability Charts

18.3. Detailed Discussion: Availability Duration and Performance

Availability as a monitoring mechanism has two important facets: the immediate effect of when it changes and then the historic perspective on how changes in availability reflect resource performance.
An historic perspective introduces the idea of availability duration. How long was a resource in a particular state? How often does it change?

Figure 18.4. Availability Counts

Availability Counts
The idea of availability duration is important to get an accurate picture of how a resource is performing. There are several ways that JBoss ON breaks out that information:
  • Total time in up, down, and disabled states
  • Percentage of time time in up, down, and disabled states
  • The number of times the resource has been in a down or disabled state
  • The mean time between failures (MTBF) and mean time to recovery (MTTR)
Note
Unknown states are not included in calculating the resource's overall availability history.
The last element is particularly important in assessing the resource's performance in light of its availability. The mean time between failures is the time between when a resource comes up and when it next goes down — it is the mean [4] of all of its up periods. This gives an idea of how stable a system is. The mean time to recovery gives an idea of how long the resource stays down, which indicates its resilience or fault tolerance. A low MTBF and high MTTR indicate some potential maintenance problems or application instability on a resource.

Figure 18.5. Up and Down Monitoring

Up and Down Monitoring
From a monitoring perspective, the historic perspective is critical, particularly when planning equipment replacements and upgrades.
From an alerting perspective — from an immediate response perspective — only availability changes matter.
The first and most obvious alert condition issues an alert based solely on a state change.
However, resources can cycle or can have a few seconds or minutes where they are inaccessible but that doesn't affect the overall performance of the resource or of whatever function it performs. A resource hits a certain state and has to stay there for a certain amount of time before the state becomes important.

Figure 18.6. Availability Duration Alert

Availability Duration Alert
Note
An availability alert does not lend itself to dampening, because the state changes and then stays, such as an availability alert that fires when the resource changes to a down state. If a resource is cycling, it may go down and up several times, each time triggering a new alert, but it may all be related to the same performance issue on the resource.
Instead of dampening, a disable setting on the alert will fire the alert once, then disable that alert definition until it is acknowledged by an administrator, as described in Section 25.2.5, “Detailed Discussion: Automatically Disabling and Recovering Alerts”. (In this case, do not set a corresponding recover setting; otherwise, if the resource is cycling, every UP reading would reset the alert and then the next DOWN report would fire another notification — essentially undoing the dampening effect of disabling the alert until acknowledgment.)


[4] This is mean in the statistical sense. It is the middle data point of all collected uptime lengths.

18.4. Detailed Discussion: "Not Up" Alert Conditions

There are four possible availability states for a resource:
  • Up
  • Down
  • Unknown
  • Disabled
Since one of the core monitoring factors for a resource is knowing its availability, alerts can be defined on any availability state change.
Generally, the condition can be set to send an alert on any explicit state. For example, a goes down condition alerts only when the availability state changes to DOWN. Any other state change is ignored.

Figure 18.7. Availability Change Conditions

Availability Change Conditions
For critical platforms or resources, however, any change in availability other than UP may need to trigger an alert. Even known state changes like DISABLED.
The goes not up condition triggers an alert if there is a change to any availability state other than UP, so it is a logical OR combination of DOWN, UNKNOWN, and DISABLED conditions.
Note
Availability change conditions are well suited to using recovery alerts. When a resource goes down (or not up) an alert can fire that informs the administrators and then enables (or recovers) a companion alert that will inform them when the resource is available again.

18.5. Viewing Group Availability

To view group availability:
  1. Click the Inventory tab in the top menu.
  2. Select the compatible or mixed groups item in the Groups menu on the left.
  3. Click the name of the group.
  4. Click the Inventory tab for the group.
Group availability is a composite of the states of its member resources. If all resources are in one state or another, the group as a whole is in that state. If the resources are in different states, then the group state is determined based on the mix of resource states.

Figure 18.8. Group Availability

Group Availability
Note
Availability states are evaluated "top down." If a resource is down, disabled, or unknown, then all of its children are immediately assumed to be in that state, as well.

Table 18.2. Group Availability States

If the Resource States Are .... ... the Group State Is ...
Empty Group (Unknown) Empty
All Red (Down) Red (Down)
Some Down or Unknown Yellow (Mixed)
Some Orange (Disabled) Orange (Disabled)
All Green (Up) Green (Up)

18.6. Disabling Resources for Maintenance

Disabling a resource essentially removes it from the JBoss ON server's view. There can be a lot of reasons why a resource will be taken offline — a machine could be moved to a new colocation facility, the platform may be upgraded, or there could be hardware changes. When an IT administrator knows that a resource will be unavailable, there is no reason to have an availability check which could trigger white noise of unnecessary reports. The resource can be disabled, which signals to the JBoss ON server that the resource availability is down (or cycling) and should be ignored.
There are two things to remember when disabling a resource:
  • If the agent is still up, then the resource availability is still reported. It is just ignored by the JBoss ON server, and is not included in any availability calculations.
  • Disabling a parent resource automatically disables all of its children, too.
  1. Click the Inventory tab in the top menu.
  2. Select the resource category, such as servers or services, in the Resources menu table on the left. Then browse or search for the resource.
  3. Select the resource in the list.
  4. Click the Disable button at the bottom of the page.
  5. When prompted, confirm that the resource should be disabled.
The disabled resource has an orange icon marking its state.

Figure 18.9. Disabled Resource

Disabled Resource
Note
When the resource is re-enabled, it has an unknown state until the next scheduled availability scan.

18.7. Allowing Plug-ins to Disable and Enable Resources Automatically

Some child or dependent resources may consistently use a disabled state to indicate that the resource is inactive. For example, a managed server in a JBoss EAP 6 domain or a web context under mod_cluster may be offline because it is inactive, and this should be treated differently than being explicitly down. In this case, the parent resource can start or stop the dependent child automatically; when not started, the child is off, but not down.
The resource plug-in itself can automatically disable and enable dependent resources by using the AvailabilityContext.disable() and AvailabilityContext.enable() methods as part of its availability definition in its component JAR files.
Important
Be careful when allowing a resource plug-in to enable or disable a resource automatically. This potentially allows the plug-in to override whatever state the administrator has set.
For more information on writing resource plug-ins, see the Development: Writing Custom Plug-ins.

18.8. Changing the Availability Check Interval

While the availability check is not strictly a metric, it does have a collection schedule that can be edited with the other metric collection schedules.
  1. Click the Inventory tab in the top menu.
  2. Select the resource category, such as servers or services, in the Resources menu table on the left. Then browse or search for the resource.
  3. Click the Monitoring tab on the resource entry.
  4. Click the Schedules subtab.
  5. Select the availability metric, and enter the desired collection period in the Collection Interval field, with the appropriate time unit (seconds, minutes, or hours).
    Note
    Availability schedules can be set on compatible groups or resource type templates. Setting it at the group or resource type level changes multiple resources simultaneously.
  6. Click Set.

18.9. Changing the Agent's Availability Scan Period

Since availability is processed on the server, large environments with hundreds of agents and tens of thousands of resources can stress the server and hurt performance. In that case, the default scan interval may be too short, and setting a longer scan interval may improve JBoss ON server performance.
Note
When changing core agent or server settings, especially ones that impact JBoss ON performance, contact Red Hat Support Services for assistance.
  1. Open the agent configuration file.
    vim agentRoot/rhq-agent/conf/agent-configuration.xml
  2. Uncomment the lines in the XML file, and set the new scan time (in seconds).
    <entry key="rhq.agent.plugins.availability-scan.period-secs" value="60"/>
  3. Restart the agent in the foreground of a terminal. Use the --cleanconfig option to force the agent to read the new configuration from the configuration file.
    agentRoot/rhq-agent/bin/rhq-agent.sh --cleanconfig

Chapter 19. Metrics and Measurements

Every operating system, application, and server has some mechanism for gaging its performance. A database has page hits and misses, servers have open connection counts, platforms have memory and CPU usage. These performance measurements can be monitored by JBoss Operations Network as metrics.

19.1. Direct Information about Resources

Metrics are a way of measuring a resource's performance or a way of measuring its load. The key word is measurement. A metric is some data point which software exposes, which is relevant to the operations or purpose of that software, that provides insight into the quantifiable behavior of that software.

Figure 19.1. Metric Graph

Metric Graph
Every type of resource has its own set of metrics, relevant to the resource type. Metrics are defined in the plug-in descriptor for that resource type. The plug-in descriptor lists the types of measurements which are possible and allowed for that resource; that's not necessarily the same thing as the metrics which are actually collected for a resource. Metrics themselves must be enabled (per resource or per metric template) and are then collected on schedule.

19.1.1. Raw Metrics, Displayed Metrics, and Storing Data

The most recent (and unprocessed) reading of metric information is raw data. This raw data is stored in the backend server, but it is not the information that is displayed in the web UI.
The information displayed in the web UI is aggregated data. In other words, the metrics displayed in JBoss ON and used for monitoring charts are calculated values, not raw data points. Once every hour, a job is run that compresses these metric values into one hour aggregates. These aggregates contain the minimum, maximum, and average value of the measured data for the aggregate period. Aggregates are also made for 6-hour and 24-hour windows.
These aggregates are then used to calculate the data displayed in the UI, according to the range of the graph and the size of the display space. The web UI has a limited display space, segmented into 60 x-axis segments. The JBoss ON server averages the raw data to create the data points for whatever the display time period is. For example, if the display range is 60 hours, each x-axis segment is 1-hour wide, and that data point is an average of all readings collected in that 1-hour segment. This aggregation is dynamic, depending on the monitoring window given in the chart views.
As Section 19.1.4, “Baselines and Out-of-Bounds Metrics” describes, the baseline calculations themselves are aggregates of the raw data, with 1-hour, 6-hour, and 24-hour windows to set minimum, maximum, and average baselines. Unlike the UI aggregates, these aggregated data are calculated and then stored as monitoring data in the server database.
Raw data are only stored for one week, by default, while aggregated values are stored for up to a year. The data storage times are configurable.

19.1.2. Current Values

As Section 19.1.1, “Raw Metrics, Displayed Metrics, and Storing Data” describes, most of the information displayed in JBoss ON is aggregated data. It is the cumulative result of multiple data points gathered over a monitoring period, and then processed and displayed within the given chart.
While JBoss ON is not a real-time monitor, it is continually gathering data. The last collected value is displayed on the Monitoring tab of a resource shows the last, raw value for the given metric.

Figure 19.2. Live Values Column

Live Values Column
As long as the Monitoring tab is open, the metrics are collected and the live values (and other averaged data) are updated along with the web UI refresh setting, rather than their configured collection schedule. (Availability is checked even more frequently than the refresh schedule, every 15 seconds.) This means that, when viewing the metrics for a resource, the most recent information is always gathered and displayed, and that information is updated as quickly as every minute.
Note
This altered schedule for collecting metrics could impact dampening rules for any alerts for a resource.
For example, if a metric is scheduled to be collected every 10 minutes with dampening set to fire an alert after three (3) occurrances of a certain condition, and the refresh interval of the UI is one minute, then an alert could fire after three minutes if a certain condition is read — even though the intent of the alert is to fire only after half an hour of the condition persisting.

19.1.3. Counting Metrics: Dynamic Values and Trend Values

It may seem obvious, but understanding metrics data includes understanding how the data are counted. There are two types of counted values:
  • Dynamic values show a momentary and changeable value, a current state. This includes things like the current number of connections to an application server or the CPU usage on a platform.
  • Trend values are cumulative counts, totals since the resource was started or over its lifetime. These values only progress in a single direction (usually, but not always, higher)
For example, there are two similar metrics for the agent's measurement subsystem: metrics collected and metrics collected per minute. The latter is a dynamic metric, meaning that its value goes up and down depending on whatever number of metrics has actually been collected in the last minute. Metrics collected (the first metric) is a cumulative number; it is the total number of metrics collected by the agent, since it started. So, these two metrics have very different values, despite counting the same data.
As Figure 19.3, “Dynamic and Trend Values for Metrics” shows, it is possible to calculate an average for trend data, but that value is meaningless. Likewise, the "minimum" for a trend value is the starting value of the selected time period, while the "maximum" is the last value for the selected time period. Other automatic calculations — such as out-of-bound values and baselines — are also meaningless with trend data, but are valuable with dynamic data.

19.1.4. Baselines and Out-of-Bounds Metrics

After metrics have been collected for a reliable amount of time, JBoss ON automatically calculates a baseline for the metric. A baseline is the normal operating range for that metric on that resource. The baseline is caluclated, by default, every three days using the aggregated data. The baseline uses a rolling window of seven days' of data.
Baseline metrics compare changes in actual data against a baseline value. Baselines allow effective trending analysis, SLAs management, and overall application health assessments as a form of fault management.
Baselines allow JBoss ON to identify metric values collected that fall outside (out-of-bounds) of the high and low baselines. Out-of-bounds metrics are reported as problem metrics.
Note
When an alert is triggered in response to a metric value, the alerting event is tracked as a problem metric.
If there are no baselines present, because they have not yet been computed or because the metric is a trends metric (meaning it is a cumulative value), no out-of-bounds factors will be calculated.
A baseline has a bandwidth that is the difference between its minimum and maximum values. The difference is the absolute amount that the problem metric is outside the baseline. To be able to compare out-of-bound values, an out-of-bounds-factor is computed by dividing the difference by the bandwidth. This creates a ratio to show comparatively how far out of the normal operation range the problem metric is.
In the Suspect Metrics report, the baselines are reported as minimum,maximum, and then the out-of-bounds metric is listed as the outlier. The difference between the baselines and the outliers is shown as a percentage: the difference of the outlier to its nearest baseline, divided by the baseline bandwidth.

Figure 19.4. Suspect Metrics Reports

Suspect Metrics Reports
Note
Calculating baselines can sometimes output non-intuitive results, as a band of (1,2) and an outlier value of 3 seems to be less than a band of (100, 200 MB) and an outlier value of 250 MB. The former is actually 100% outside the expected band, while the latter is only 50% outside.

Figure 19.5. Out-of-Bound Factors

Out-of-Bound Factors
Out-of-bounds-factors are recalculated each hour during a calculation job. The job assesses the aggregate and determines if there is a more severe outlier than before. The chart always displays the most severe outlier.
When the baselines for a metric change, all recorded out-of-bounds values become invalid and are removed because the out-of-bounds measurement was computed against an old baseline.

19.1.5. Collection Schedules

The metric collection schedule is defined individually for each metric in the resource type's plug-in descriptor.
There is no rule on how frequently metrics are collected. Default intervals range between 10 minutes and 40 minutes for most metrics. While some metrics are commonly important (like free memory or CPU usage on platforms), the importance of many metrics depends on the general IT and production environments and the resource itself. Set reasonable intervals to collect important metrics with a frequency that adequately reflects the resource's real life performance.
The shortest configurable interval is 30 seconds, although an interval that short should be used sparingly because the volume of metrics reported could impact database performance.

19.1.6. Metric Schedules and Resource Type Templates

Unlike other types of monitoring data which are unique to an resource (availability, events, traits), metrics can be universal for all resources of that type.
Metric collection schedules define whether an allowed metric for a resource is actually enabled and what its collection interval is. A schedule is set at the resource-level, but administrator-defined default settings can be applied to all resources of a type by using metrics collection templates.
Templates are a server configuration setting. They define what metrics are active and what the collection schedules are for all resources of a specific type. When templates are used, they supplant whatever default metrics settings are given in the plug-in descriptor. (A metric template only defines whether a metric is enabled and what its interval is — the plug-in descriptor alone defines what metrics are available for a resource type.)
These settings can be overridden at the resource-level, as necessary. Still, metrics collection templates provide a simple way to apply metrics settings consistently across resources and machines.

19.2. Viewing Metrics and Baseline Charts

The core of monitoring is the metric information that is collected for a resource. Each resource has different metrics (and these are listed in the Resource Reference: Monitoring, Operation, and Configuration Options). Three monitoring charts show the same information, but in different perspectives and different levels of detail:
  • The resource-level Summary
  • Graphs
  • Tables
The Summary tab for resources, much like the Dashboard for the entire JBoss ON inventory, has portlets that show different resource information. Most resources have three portlets for measurements, events, and out-of-bound metrics. The Measurements portlet has small thumbnail charts that show the trend for the metric, along with the current reading.
Clicking any of the metrics will open the baseline chart for that metric. As is described in Section 19.1.4, “Baselines and Out-of-Bounds Metrics”, baselines calculate an average reading for a given period of time, with the high and low measurements in that period creating upper and lower bounds. Baselines, by default, are calculated every three days using the data from the previous seven days for the calculation. Baseline measurements are essential for establishing operating norms so that administrators can effectively set alerts for resources.

Figure 19.6. Individual Metric Graph

Individual Metric Graph
The Metrics area in the Monitoring tab shows all of the metrics in a table, with columns for the high, low, and current readings. There is also a column which shows the number of active alerts for each metric.

Figure 19.7. Metrics Table

Metrics Table
Expanding a metric opens its individual metric graph, giving the trend for the past eight hours. This provides more granular detail than the summary or baselines charts, showing the readings for each collection period and the precise readings.

Figure 19.8. Metric Graph within the Table

Metric Graph within the Table

19.3. Defining Metrics Collection

19.3.1. Setting Baseline Calculation Properties

The monitoring baselines have two configuration properties that define how the automatic metric baselines are calculated. These properties don't set the value; they set the window of time used for the baseline averages.
  1. In the Configuration menu, select the System Settings item.
  2. Scroll to the Automatic Baseline Configuration Properties section.
  3. Change the settings to define the window used for calculation.
    • Baseline Frequency sets the interval, in days, for how often baselines are recalculated. The default is three days.
    • Baseline Dataset sets the time interval, in days, used to calculate the baseline. The default is seven days.

19.3.2. Setting Collection Intervals for a Specific Resource

Metrics are collected at the intervals specified by the collection schedule. Because not all metrics are mission critical or even likely to change, JBoss ON has different collection schedules for different metrics, with critical metrics collected more frequently.
For most environments, setting a daily collection schedule (once every 24 hours) is sufficient.
To change the collection interval for a specific metric:
  1. Click the Inventory tab in the top menu.
  2. Select the resource category, such as servers or services, in the Resources menu table on the left. Then browse or search for the resource.
  3. Click the Monitoring tab on the resource entry.
  4. Click the Schedules subtab.
  5. Select the metric for which to change the monitoring frequency. Multiple metrics can be selected, if they will all be changed to the same frequency.
  6. Enter the desired collection period in the Collection Interval field, with the appropriate time unit (seconds, minutes, or hours).
  7. Click Set.

19.3.3. Enabling and Disabling Metrics for a Specific Resource

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Monitoring tab on the resource entry.
  4. Click the Schedules sub tab.
  5. Select the metrics to enable or disable.
  6. Click the Enable or Disable button.

19.3.4. Changing Metrics Templates

The metrics which are collected for a resource type are defined in the monitoring template for the resource type. Each resource type has some metrics disabled by default, and these must be manually enabled. Likewise, metrics which are enabled by default can be disabled.
Note
Metric templates only apply to new resources of that resource type unless the checkbox is selected to apply them to existing resources as well as new resources.
  1. In the top navigation, open the Administration menu, and then the Configuration menu.
  2. Select the Metric Collection Templates menu item. This opens a long list of resource types, both for platforms and server types.
  3. Locate the type of resource for which to create the template definition.
  4. Click the pencil icon to edit the metric collection schedule templates.
  5. Select the required metrics to enable or disable, and click the Enable or Disable button.
  6. To edit the frequency that a metric is collected, select the Update schedules for existing resources of marked type checkbox, and then enter the desired time frame into the Collection Interval for Selected: field.
  7. Click the Set button.

19.3.5. Adding a PostgreSQL Query as a Metric

A SQL query can be added to a PostgreSQL database as a child resource. That entry becomes a custom metric for that PostgreSQL database.
A query metric must have two columns that allow the JBoss ON agent to collect data for the query:
  • metricColumn
  • count(id)
The query has to return a single row with those two columns. The first column signals that it is a collected metric, and the second gives the count for the metric.
For example, to track logged-in users:
SELECT 'metricColumn', count(id) FROM my_application_user WHERE is_logged_in = true
The SELECT statement defines the metric for the JBoss ON agent. The rest of the query collects the data from the database. Simple as that.
To add a metric based on a query:
  1. Click the Inventory tab in the top menu.
  2. Search for the PostgreSQL resource.
  3. Click the Inventory tab for the PostgreSQL database.
  4. Click the Import button in the bottom of the Inventory tab, and select Query.
  5. Fill in the properties for the query metric. Three fields are particularly important:
    • The Table gives which table within the database contains the data; this is whatever is in the FROM statement in the query.
    • The Metric Query contains the full query to run. The SELECT statement must be 'metricColumn',count(id) to format the query properly for the JBoss ON agent to interpret it as a metric.
      SELECT 'metricColumn', count(id) FROM my_application_user WHERE is_logged_in = true
    • The Name field is not important in configuring the metric, but it is important identifying the metric later.
Once the query is created, then the agent begins collecting the counts for the data.

Figure 19.9. Query: Total Logged-in User Count

Query: Total Logged-in User Count

Chapter 20. Events

Metric data are collected according to a schedule. However, some actions occur on a resource sporadically, such as sudden system shutdowns. These are events. Since event data can be generated randomly, events are sent to agents immediately when they are detected.

20.1. Events, Logs, and Resources

Operating and some server types keep their own log files and register a steady stream of information about incidents for that resource, from simple debug information to critical errors. Metrics are collected on a (more or less) set schedule. Events are entirely random, because they are based on real actions as they occur. Events, then, give a different perspective on resource performance.
Events monitoring tracks those streaming log file messages. In a sense, JBoss ON event monitoring is a filtered log viewer. When events are enabled, all of the log messages flow through JBoss ON's event viewer. However, the types of messages that JBoss ON records and displays can be limited in the event configuration so that only certain types of messages or messages matching certain strings are included in the events view.
A handful of resource types record and display events:
  • Linux (syslog)
  • Windows (Windows event logs)
  • Apache server (log files)
  • JBoss EAP server (log files)
Note
Events must be enabled and configured for a resource before they become active.
Default events are taken from standard log files. Custom resource types can identify event sources in logs or in asynchronous messaging systems such as a JMX notification or a JMS messaging system.

20.2. Event Date Formatting

JBoss ON expects log messages to follow the log4j log pattern, over all.
date severity [class] message
The date format is configurable when event monitoring is enabled. For custom logs, a custom pattern can be added. If no date format is given, then the three standard formats used by log4j are tried.
YYYY-mm-dd HH:mm:ss,SSS
HH:mm:ss,SSS
dd MM yyyy HH:mm:ss,SSS
The severity must be next. It can be plain, surrounded by brackets, or surrounded by parentheses.
date SEVERITY [org.foo.bar] my message
date [SEVERITY] [org.foo.bar] my message
date ( SEVERITY ) [org.foo.bar] my message
After the severity, there are no real constraints on the format of the log entry. If classes or other identifiers are passed, they are properly displayed.

20.3. Defining a New Event

Events are only recognized by the monitoring service if events logging is properly enabled for the specific service being logged. This requires creating a log event for the log or system service, specifying a log path on the resource, and setting a date format which matches the format for the log.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Inventory tab on the resource entry.
  4. Select the Connection Settings subtab.
  5. Click the plus icon under the Events Log section to add a log instance to monitor.
  6. Enable event log collection.
  7. Set the parameters for the event log collection.
    Depending on the resource being configured, there are slightly different options for the event log configuration. All of the resources all allow different filters to identify which log messages to include:
    • A minimum severity of any error message.
    • A regular expression or pattern to use on log message strings.
    Additionally, the application servers and Linux allow different log locations to be specified. (The Windows resource uses its System Event Log.) Along with accommodating custom log locations, it also allows for other logging services to be used. For Linux, this allows both platform and program logs to be monitored; for application services, this allows logs within a messaging service to be checked.
    As discussed in Section 20.2, “Event Date Formatting”, there are different potential date formats that can be used in logging; if anything other than log4j is used, the pattern can be specified so that the agent can read the log entries.

    Figure 20.1. EAP Event Log Configuration

    EAP Event Log Configuration
    Unlike either the application servers or Windows, Linux systems can log events in a system file or in a listener. If an rsyslog server or local syslog listener is configured, then it is possible to select a listener (rather than a local file) and to add the listener port and bind address for a remote server.

    Figure 20.2. Linux Event Log Configuration

    Linux Event Log Configuration

20.4. Viewing Events

  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Events tab on the resource entry. Events can be filtered by severity (debug, info, warn, error, and fatal).
  4. Click the specific event for further details.
Note
Recent events can also be monitored directly from the Dashboard by using the "Recent Events" portlet. For more information about adding portlets to the dashboard, refer to the Section 1.7.2, “Adding and Editing Dashboards” section of this document.

20.5. Detailed Discussion: Event Correlation

There are a lot of moving pieces in IT infrastructure, and one change can cause a cascade of results. One factor in being able to respond to events and manage infrastructure is the ability to correlate an event with its cause.
While there is not an exact way to pinpoint what change caused what event or alert, there is a way to help visualize how unrelated JBoss ON elements — alerts, configuration changes, event logs, drift detection, or inventory changes — might be related. Each resource has a Timeline area on its Summary tab. This collects all of the major operations that have occurred on that resource, either through JBoss ON or as detected by JBoss ON.
The thing to look for is clusters. For example, in this case, there was a configuration change on the JBoss server, and shortly after there was a critical alert and a simultaneous error in the JBoss server's logs. That is a reasonable indication that the configuration change caused a performance problem.

Figure 20.3. Resource Timeline Cluster

Resource Timeline Cluster

Chapter 21. URL Response Time Monitoring

Part of web application performance is determined by how responsive it is to requests. JBoss Operations Network supplies an extra monitoring setting called response time filters which measures the amount of time it takes for a URL to respond to a request.

21.1. Call-Time (or Response Time) Monitoring for URLs

Performance for web applications is more complex than simple availability. How quickly an application can respond to requests is as important as whether the application is running.
The time that it takes an application to respond to some kind of request is call-time or response time data.
Two types of resources support call-time data by default:
  • Session beans, for EJB method calls.
  • Web servers (standalone or embedded in an application server), for URL responses. Web servers require an additional response time filter with configuration on what URL resources to measure for response times.
Response time monitoring is more based upon performance than it is on capturing a single data point. Response or call-time data are displayed as aggregates, showing maximum, minimum, and average response times per URL or per method.

21.2. Viewing Call Time Metrics

Both session bean resources and web server resources have an additional Monitoring subtab called Calltime. All of the call-time data or response time data ranges (minimum, maximum, and averages) are displayed for each URL resource or method. As new URLs or methods are accessed, they are dynamically added to the results table.

Figure 21.1. URL Metrics for a Web Server

URL Metrics for a Web Server

21.3. Extended Example: Website Performance

The Setup

A significant amount of Example Co.'s business, services, and support is tied to its website. Customers have to be able to access the site to purchase products, schedule training or consulting, and to receive most support and help. If the site is slow or if some resources are inaccessible, customers immediately have a negative experience.

The goal is not to monitor whether the web server is running, but whether the web application are responsive and performing as Example Co.'s customers expect.

What to Do

Tim the IT Guy identifies three different ways that he can capture web application performance information:

  • Response times for individual URLs
  • Throughput information like total number of requests and responses
  • Counts for critical HTTP response codes
Both monitoring and alerting can be configured based solely off response time and throughput metrics. However, bad website performance is indicative of an underlying problem with the web server or its associated database. Therefore, Tim not only wants to be informed when website performance it poor; he wants to correlate some performance metrics with underlying server and database performance and launch operations that can mitigate poor responsiveness.
Tim maps a few common scenarios which cause poor website or web server performance and plans simple, immediate operations that JBoss ON can perform until an IT staffer can analyze the problem. Tim attempts to narrow down potential causes to a performance. Alerts can be issued for a single condition or for a combination of conditions. In Tim's case, he creates a three different alerts based on different combinations of underlying causes for performance problems (Section 25.1.2, “Basic Procedure for Setting Alerts for a Resource”).
  • If there are poor response times and a high number of HTTP error 500 responses, then the alert can be configured with an operation to restart the web server (Section 25.3.2, “Detailed Discussion: Initiating an Operation”).
  • If there are poor response times and a high number of HTTP error 404 response (meaning that resources may not be delivered properly), then the alert is configured to restart the database.
  • If there are poor response times and a high number of total requests per minute, then it may mean that there is simply too much load on the server. The alert can be configured to create another web server instance to help with load balancing; using a JBoss ON CLI script allows the JBoss ON server to create new resources as necessary and deploy bundles of the appropriate web apps (Section 25.3.3, “Detailed Discussion: Initiating Resource Scripts”).
The most critical factor is the response time, which is a factor in every alert. Each alert has one condition based on the call time data, specifically of the call-time data moves past a certain threshold.
Tim picks a reasonable threshold, about 15 seconds, for performance. If performance degrades so that the HTTP Response Time metric returns a value higher than 20 seconds to load pages, JBoss ON issues an alert.
Alternatively, he could alert on simple call-time changes. Call-time changes will trigger an alert for any change from the established baseline, meaning a new minimum, maximum, or average value. A change of any kind can alert in either a decrease in performance or an increase in performance. A threshold alert only alerts on a specific change.
Tim then adds the other condition, with an AND operator, to each alert he configures.
Also, most web app-related metrics are not enabled by default. Tim enables the Total Number of Requests per Minute, Total Number of Responses per Minute, Number of 404 Responses per Minute, and Number of 500 Responses per Minute metrics for each web server (Section 19.3.4, “Changing Metrics Templates”).
For every alert, Tim also configures an email notification along with the other responses, so that a member of the IT staff can evaluate any website performance problems and take additional actions if necessary.

21.4. Configuring EJB Call-Time Metrics

EJB method call-time measurements are not collected by default.
  1. Click the Inventory tab in the top menu.
  2. Select the Services menu table on the left, and then navigate to the EJB resource.
    Note
    It is probably easier to search for the session bean by name, if you know it.
  3. Click the Monitoring tab on the EJB resource entry.
  4. Click the Schedules subtab.
  5. Select the Method Execution Time metric. This metric is the calltime type.
  6. Click the Enable at the bottom of the list.

21.5. Configuring Response Time Metrics for JBoss EAP

JBoss ON can monitor application performance by testing how responsive deployed applications are to client requests. Specifically, JBoss ON can determine how quickly an application responds to requests. This is called a response time measurement.
To collect response time metrics for a JBoss EAP server, first install the servlet JAR for the response time filter and configure the web server to use it. Then, enable the metrics collection for the web resource.

21.5.1. Installing the Response Time Filters for JBoss EAP 6

Note
To collect response time metrics, the EAP 6 agent plug-in must be configured to connect to the server http(s) management interface.

How To Install the Response Time Filters in JBoss EAP

Follow this procedure to correctly install the Response Time Filters on JBoss EAP 6

Prerequisites

Procedure 21.1. 

  1. Download the response time packages for JBoss from the JBoss ON UI. The response time filters are packaged as AS 7 modules. There are two modules to obtain:
    rhq-rtfilter-module.zip
    rhq-rtfilter-subsystem-module.zip
    Note
    This can also be done from the command line using wget:
    [root@server ~]# wget http://server.example.com:7080/downloads/connectors/rhq-rtfilter-module.zip
    [root@server ~]# wget http://server.example.com:7080/downloads/connectors/rhq-rtfilter-subsystem-module.zip
    1. Click the Administration tab in the top menu.
    2. In the Configuration menu box on the left, select the Downloads item.
    3. Click the rhq-rtfilter-module.zip and rhq-rtfilter-subsystem-module.zip links, and save the files to an accessible directory, like the /tmp directory.
  2. Open the modules/ directory for the JBoss EAP 6 instance. For example:
    [root@server ~]# cd /opt/jboss-eap-6.0/modules/
  3. Unzip the rhq-rtfilter-module.zip archive to install the response time filter JAR and the associated module.xml file.
    [root@server modules]# unzip /tmp/rhq-rtfilter-module.zip
  4. Open the configuration file for the server, domain.xml or standalone.xml.
  5. Deploy the response time module globally by adding the module to the list of global modules in the <subsystem> element. Save the file once complete.
    <profile...
    	<subsystem xmlns="urn:jboss:domain:ee:1.1">
    		<global-modules>
    		    <module name="org.rhq.helpers.rhq-rtfilter" slot="main"/>
    		</global-modules>
    	</subsystem>
    </profile>
  6. Unzip the rhq-rtfilter-subsystem-module.zip archive to install the subsystem response time filter JAR and the associated module.xml file.
    [root@server modules]# unzip /tmp/rhq-rtfilter-subsystem-module.zip
    This installs the filters as a subsystem for the application server or individual web apps.
  7. After the filters have been installed, the JBoss EAP 6 server needs to be configured to use them.
    The response time filter can be deployed globally, for all web applications hosted by the EAP/AS instance, or it can be configured for a specific web application.
    To deploy the filter as a global subsystem:
    1. Open the configuration file for the server, domain.xml or standalone.xml.
    2. Add the an <extensions> element for the response time filter.
      <extension module="org.rhq.helpers.rhq-rtfilter-subsystem"/>
  8. Add a <subsystem> element beneath the <profile> element.
    All that is required for response time filtering to work is the default <subsystem> element, without any optional parameters. However, the parameters can be uncommented and set as necessary; the different ones are described in Table 21.1, “Parameters Available for User-Defined <filter> Settings”.
    The <subsystem> element should be added even if none of the optional parameters are set.
    <subsystem xmlns="urn:rhq:rtfilter:1.0">
        <!-- Optional parameters.
    
           <init-param>
               <param-name>chopQueryString</param-name>
               <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>logDirectory</param-name>
              <param-value>/tmp</param-value>
           </init-param>
           <init-param>
               <param-name>logFilePrefix</param-name>
               <param-value>localhost_7080_</param-value>
           </init-param>
           <init-param>
               <param-name>dontLogRegEx</param-name>
               <param-value></param-value>
           </init-param>
           <init-param>
              <param-name>matchOnUriOnly</param-name>
              <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>timeBetweenFlushesInSec</param-name>
               <param-value>73</param-value>
           </init-param>
           <init-param>
               <param-name>flushAfterLines</param-name>
               <param-value>13</param-value>
           </init-param>
           <init-param>
               <param-name>maxLogFileSize</param-name>
               <param-value>5242880</param-value>
           </init-param>
    -->
    </subsystem>

Procedure 21.2. How To Configure Response Time Filters for an Individual Web Application in JBoss EAP 6

  1. Open the web application's web.xml file.
    [root@server ~]# vim WARHomeDir/WEB-INF/web.xml
  2. Add the filter and, depending on the configuration, filter mapping elements to the file. This activates the response time filtering.
    All that is required for response time filtering to work is the default <filter> element, without any optional parameters. However, the parameters can be uncommented and set as necessary; the different ones are described in Table 21.1, “Parameters Available for User-Defined <filter> Settings”.
       <filter>
           <filter-name>RhqRtFilter</filter-name>
           <filter-class>org.rhq.helpers.rtfilter.filter.RtFilter</filter-class>
    
    <!-- Optional parameters.
    
           <init-param>
               <param-name>chopQueryString</param-name>
               <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>logDirectory</param-name>
              <param-value>/tmp</param-value>
           </init-param>
           <init-param>
               <param-name>logFilePrefix</param-name>
               <param-value>localhost_7080_</param-value>
           </init-param>
           <init-param>
               <param-name>dontLogRegEx</param-name>
               <param-value></param-value>
           </init-param>
           <init-param>
              <param-name>matchOnUriOnly</param-name>
              <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>timeBetweenFlushesInSec</param-name>
               <param-value>73</param-value>
           </init-param>
           <init-param>
               <param-name>flushAfterLines</param-name>
               <param-value>13</param-value>
           </init-param>
           <init-param>
               <param-name>maxLogFileSize</param-name>
               <param-value>5242880</param-value>
           </init-param>
    -->
    
       </filter>
    
       <!-- Use this only when also enabling the RhqRtFilter in the filter
       <filter-mapping>
           <filter-name>RhqRtFilter</filter-name>
           <url-pattern>/*</url-pattern>
       </filter-mapping>
       -->
  3. Restart the JBoss EAP server to load the new web.xml settings.

Table 21.1. Parameters Available for User-Defined <filter> Settings

Parameter
Description
chopQueryString
Only the URI part of a query will be logged if this parameter is set to true. Otherwise the whole query line will be logged. Default is true.
logDirectory
The directory where the log files will be written to. Default setting is {jboss.server.log.dir}/rt/ (usually server/xxx/log/rt). If this property is not defined, the fallback is {java.io.tmpdir}/rt/ (/tmp/ on UNIX®, and ~/Application Data/Local Settings/Temp – check the TEMP environment variable) is used. If you specify this init parameter, no directory rt/ will be created, but the directory you have provided will be taken literally.
logFilePrefix
A prefix that is put in front of the log file names. Default is the empty string.
dontLogRegEx
A regular expression that is applied to query strings. See java.util.regex.Pattern. If the parameter is not given or an empty string, no pattern is applied.
matchOnUriOnly
Should the dontLogRegEx be applied to the URI part of the query (true) or to the whole query string (false). Default is true.
timeBetweenFlushesInSec
Log lines are buffered by default. When the given number of seconds have passed and a new request is received, the buffered lines will be flushed to disk even if the number of lines to flush after (see next point) is not yet reached.. Default value is 60 seconds (1 Minute).
flushAfterLines
Log lines are buffered by default. When the given number of lines have been buffered, they are flushed to disk. Default value is 10 lines.
maxLogFileSize
The maximum allowed size, in bytes, of the log files; if a log file exceeds this limit, the filter will truncate it; the default value is 5242880 (5 MB).
vHostMappingFile
This properties file must exist on the Tomcat process classpath. For example, in the ../conf/vhost-mappings.properties. The file contains mappings from the 'incoming' vhost (server name) to the vhost that should be used as the prefix in the response time log file name. If no mapping is present (no file or no entry response times are set), then the incoming vhost (server name) is used. For example:
pickeldi.users.acme.com=pickeldi
pickeldi=
%HOST%=
The first mapping states that if the incoming vhost is 'host1.users.acme.com', then the log file name should get a vhost of 'host1' as prefix, separated by a _ from the context root portion. The second mapping states that if the 'incoming' vhost is 'host1', then no prefix, and no _, should be used. The third mapping uses a special left-hand-side token, '%HOST%'. This mapping states that if the 'incoming' vhost is a representation of localhost then no prefix, and no _ , should be used.
%HOST% will match the host name, or canonical host name or IP address, as returned by the implementation of InetAddress.getLocalHost().
The second and third mappings are examples of empty right hand side, but could just as well have provided a vhost.
This is a one time replacement. There is no recursion in the form that the result of the first line would then be applied to the second one.

21.5.2. Installing the Response Time Filters for JBoss EAP 7

Note
To collect response time metrics, the JBoss EAP 7 agent plug-in must be configured to connect to the server http(s) management interface.

How To Install the Response Time Filters in JBoss EAP

Follow this procedure to correctly install the Response Time Filters on JBoss EAP 7

Prerequisites

Procedure 21.3. 

  1. Download the response time packages for JBoss from the JBoss ON UI. The response time filters are packaged as AS 7 modules. There are two modules to obtain, both of which are contained in one zip archive:
    rhq-rtfilter-wfly-10-dist.zip
    Note
    This can also be done from the command line using wget:
    [root@server ~]# wget http://server.example.com:7080/downloads/connectors/rhq-rtfilter-wfly-10-dist.zip
    1. Click the Administration tab in the top menu.
    2. In the Configuration menu box on the left, select the Downloads item.
    3. Click the rhq-rtfilter-wfly-10-dist.zip link, and save the files to an accessible directory, like the /tmp directory.
  2. Open the modules/ directory for the JBoss EAP 7 instance. For example:
    [root@server ~]# cd /opt/jboss-eap-7.0/modules/
  3. Unzip the rhq-rtfilter-wfly-10-dist.zip archive to install the JARs and the associated module.xml files.
    [root@server modules]# unzip /tmp/rhq-rtfilter-wfly-10-dist.zip
  4. Open the configuration file for the server, domain.xml or standalone.xml.
  5. Deploy the response time module globally by adding the module to the list of global modules in the <subsystem> element. Note that version of domain:ee subsystem differs between JBoss EAP versions. Save the file once complete.
    <profile...
    	<subsystem xmlns="urn:jboss:domain:ee:1.1">
    		<global-modules>
    		    <module name="org.rhq.helpers.rhq-rtfilter" slot="main"/>
    		</global-modules>
    	</subsystem>
    </profile>
  6. After the filters have been installed, the JBoss EAP 7 server needs to be configured to use them.
    The response time filter can be deployed globally, for all web applications hosted by the JBoss EAP instance, or it can be configured for a specific web application.
    To deploy the filter as a global subsystem:
    1. Open the configuration file for the server, domain.xml or standalone.xml.
    2. Add the an <extensions> element for the response time filter.
      <extension module="org.rhq.helpers.rhq-rtfilter-wfly-10-subsystem"/>
  7. Add a <subsystem> element beneath the <profile> element.
    All that is required for response time filtering to work is the default <subsystem> element, without any optional parameters. However, the parameters can be uncommented and set as necessary; the different ones are described in Table 21.2, “Parameters Available for User-Defined <filter> Settings”.
    The <subsystem> element should be added even if none of the optional parameters are set.
    <subsystem xmlns="urn:rhq:rtfilter:1.0">
        <!-- Optional parameters.
    
           <init-param>
               <param-name>chopQueryString</param-name>
               <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>logDirectory</param-name>
              <param-value>/tmp</param-value>
           </init-param>
           <init-param>
               <param-name>logFilePrefix</param-name>
               <param-value>localhost_7080_</param-value>
           </init-param>
           <init-param>
               <param-name>dontLogRegEx</param-name>
               <param-value></param-value>
           </init-param>
           <init-param>
              <param-name>matchOnUriOnly</param-name>
              <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>timeBetweenFlushesInSec</param-name>
               <param-value>73</param-value>
           </init-param>
           <init-param>
               <param-name>flushAfterLines</param-name>
               <param-value>13</param-value>
           </init-param>
           <init-param>
               <param-name>maxLogFileSize</param-name>
               <param-value>5242880</param-value>
           </init-param>
    -->
    </subsystem>

Procedure 21.4. How To Configure Response Time Filters for an Individual Web Application in JBoss EAP 7

  1. Open the web application's web.xml file.
    [root@server ~]# vim WARHomeDir/WEB-INF/web.xml
  2. Add the filter and, depending on the configuration, filter mapping elements to the file. This activates the response time filtering.
    All that is required for response time filtering to work is the default <filter> element, without any optional parameters. However, the parameters can be uncommented and set as necessary; the different ones are described in Table 21.2, “Parameters Available for User-Defined <filter> Settings”.
       <filter>
           <filter-name>RhqRtFilter</filter-name>
           <filter-class>org.rhq.helpers.rtfilter.filter.RtFilter</filter-class>
    
    <!-- Optional parameters.
    
           <init-param>
               <param-name>chopQueryString</param-name>
               <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>logDirectory</param-name>
              <param-value>/tmp</param-value>
           </init-param>
           <init-param>
               <param-name>logFilePrefix</param-name>
               <param-value>localhost_7080_</param-value>
           </init-param>
           <init-param>
               <param-name>dontLogRegEx</param-name>
               <param-value></param-value>
           </init-param>
           <init-param>
              <param-name>matchOnUriOnly</param-name>
              <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>timeBetweenFlushesInSec</param-name>
               <param-value>73</param-value>
           </init-param>
           <init-param>
               <param-name>flushAfterLines</param-name>
               <param-value>13</param-value>
           </init-param>
           <init-param>
               <param-name>maxLogFileSize</param-name>
               <param-value>5242880</param-value>
           </init-param>
    -->
    
       </filter>
    
       <!-- Use this only when also enabling the RhqRtFilter in the filter
       <filter-mapping>
           <filter-name>RhqRtFilter</filter-name>
           <url-pattern>/*</url-pattern>
       </filter-mapping>
       -->
  3. Restart the JBoss EAP server to load the new web.xml settings.

Table 21.2. Parameters Available for User-Defined <filter> Settings

Parameter
Description
chopQueryString
Only the URI part of a query will be logged if this parameter is set to true. Otherwise the whole query line will be logged. Default is true.
logDirectory
The directory where the log files will be written to. Default setting is {jboss.server.log.dir}/rt/ (usually server/xxx/log/rt). If this property is not defined, the fallback is {java.io.tmpdir}/rt/ (/tmp/ on UNIX®, and ~/Application Data/Local Settings/Temp – check the TEMP environment variable) is used. If you specify this init parameter, no directory rt/ will be created, but the directory you have provided will be taken literally.
logFilePrefix
A prefix that is put in front of the log file names. Default is the empty string.
dontLogRegEx
A regular expression that is applied to query strings. See java.util.regex.Pattern. If the parameter is not given or an empty string, no pattern is applied.
matchOnUriOnly
Should the dontLogRegEx be applied to the URI part of the query (true) or to the whole query string (false). Default is true.
timeBetweenFlushesInSec
Log lines are buffered by default. When the given number of seconds have passed and a new request is received, the buffered lines will be flushed to disk even if the number of lines to flush after (see next point) is not yet reached.. Default value is 60 seconds (1 Minute).
flushAfterLines
Log lines are buffered by default. When the given number of lines have been buffered, they are flushed to disk. Default value is 10 lines.
maxLogFileSize
The maximum allowed size, in bytes, of the log files; if a log file exceeds this limit, the filter will truncate it; the default value is 5242880 (5 MB).
vHostMappingFile
This properties file must exist on the Tomcat process classpath. For example, in the ../conf/vhost-mappings.properties. The file contains mappings from the 'incoming' vhost (server name) to the vhost that should be used as the prefix in the response time log file name. If no mapping is present (no file or no entry response times are set), then the incoming vhost (server name) is used. For example:
pickeldi.users.acme.com=pickeldi
pickeldi=
%HOST%=
The first mapping states that if the incoming vhost is 'host1.users.acme.com', then the log file name should get a vhost of 'host1' as prefix, separated by a _ from the context root portion. The second mapping states that if the 'incoming' vhost is 'host1', then no prefix, and no _, should be used. The third mapping uses a special left-hand-side token, '%HOST%'. This mapping states that if the 'incoming' vhost is a representation of localhost then no prefix, and no _ , should be used.
%HOST% will match the host name, or canonical host name or IP address, as returned by the implementation of InetAddress.getLocalHost().
The second and third mappings are examples of empty right hand side, but could just as well have provided a vhost.
This is a one time replacement. There is no recursion in the form that the result of the first line would then be applied to the second one.

21.5.3. Enabling the Call-Time Metric

Response time metrics are configured on a live application deployment. A deployment resource is a child of either a standalone EAP 6 server or of a server group.

Figure 21.2. Web Runtime Resource

Web Runtime Resource
  1. Click the Inventory tab in the top menu.
  2. Click the Servers - Top Level Imports item, and select the JBoss EAP 6 resource.
  3. Navigate to the deployment resource, and expand the application to the web subsystem.
  4. Click the Monitoring tab on the web resource entry.
  5. Click the Schedules subtab.
  6. Select the Response Time metric. This metric is the calltime type.
  7. Click the Enable at the bottom of the list.
  8. Click the Inventory tab on the web entry.
  9. Select the Connection Settings subtab.
  10. Unset the check boxes for the response time configuration and fill in the appropriate values for the web application.
    • The response times log which is used by that specific web application. The log file is a required setting for call-time data collection to work..
    • Any files, elements, or pages to exclude from response time measurements. The response times log records times for all resources the web server serves, including support files like CSS files and icons or background images.
    • The same page can be accessed with different parameters passed along in the URL. The Response Time Url Transforms field provides a regular expression that can be used to strip or substitute the passed parameters.

21.6. Setting up Response Time Monitoring for Apache, EWS/Tomcat, and JBoss EAP 5

Response time monitoring is not available from application or web servers by default; a special module must be installed which allows JBoss ON to collect response time metrics.
After the module is installed, then the HTTP metrics can be enabled for the resource.

21.6.1. Parameters for User-Defined <filter>s

Administrators can set rules for how response time metrics are collected for application and web servers. This are response time filters.
For response time monitoring to work, JBoss, Apache, and EWS/Tomcat all must have response time filters installed. Additional configuration options can be added to those default filters to customize the monitoring for the resource.

Table 21.3. Parameters for User-Defined <filter>s

Parameter
Description
chopQueryString
Only the URI part of a query will be logged if this parameter is set to true. Otherwise the whole query line will be logged. Default is true.
logDirectory
The directory where the log files will be written to. Default setting is {jboss.server.log.dir}/rt/ (usually server/xxx/log/rt). If this property is not defined, the fallback is {java.io.tmpdir}/rt/ (/tmp/ on UNIX®, and ~/Application Data/Local Settings/Temp – check the TEMP environment variable) is used. If you specify this init parameter, no directory rt/ will be created, but the directory you have provided will be taken literally.
logFilePrefix
A prefix that is put in front of the log file names. Default is the empty string.
dontLogRegEx
A regular expression that is applied to query strings. See java.util.regex.Pattern. If the parameter is not given or an empty string, no pattern is applied.
matchOnUriOnly
Should the dontLogRegEx be applied to the URI part of the query (true) or to the whole query string (false). Default is true.
timeBetweenFlushesInSec
Log lines are buffered by default. When the given number of seconds have passed and a new request is received, the buffered lines will be flushed to disk even if the number of lines to flush after (see next point) is not yet reached.. Default value is 60 seconds (1 Minute).
flushAfterLines
Log lines are buffered by default. When the given number of lines have been buffered, they are flushed to disk. Default value is 10 lines.
maxLogFileSize
The maximum allowed size, in bytes, of the log files; if a log file exceeds this limit, the filter will truncate it; the default value is 5242880 (5 MB).
vHostMappingFile
This properties file must exist in the Tomcat process classpath. For example, in the conf/vhost-mappings.properties file. The file contains mappings from the 'incoming' vhost (server name) to the vhost that should be used as the prefix in the response time log file name. If no mapping is present (no file or no entry response times are set), then the incoming vhost (server name) is used. For example:
pickeldi.users.acme.com=pickeldi
pickeldi=
%HOST%=
The first mapping states that if the incoming vhost is 'host1.users.acme.com', then the log file name should get a vhost of 'host1' as prefix, separated by a _ from the context root portion. The second mapping states that if the 'incoming' vhost is 'host1', then no prefix, and no _, should be used. The third mapping uses a special left-hand-side token, '%HOST%'. This mapping states that if the 'incoming' vhost is a representation of localhost then no prefix, and no _ , should be used.
%HOST% will match the host name, or canonical host name or IP address, as returned by the implementation of InetAddress.getLocalHost().
The second and third mappings are examples of empty right hand side, but could just as well have provided a vhost.
This is a one time replacement. There is no recursion in the form that the result of the first line would then be applied to the second one.

21.6.2. Installing Response Time Filters for JBoss EAP/AS 5

To collect response time metrics for a JBoss EAP/AS 5 server, first install the servlet JAR for the response time filter and configure the web server to use it.
  1. Download the Response Time packages for JBoss from the JBoss ON UI.
    Note
    This can also be done from the command line using wget:
    [root@server ~]# wget http://server.example.com:7080/downloads/connectors/connector-rtfilter.zip
    1. Click the Administration tab in the top menu.
    2. In the Configuration menu box on the left, select the Downloads item.
    3. Click the connector-rtfilter.zip link, and save the file.
  2. Unzip the connectors.
    [root@server ~]# unzip connector-rtfilter.zip
  3. Copy the rhq-rtfilter-version.jar file into the lib/ directory for the profile.
    [root@server ~]# cp connector-rtfilter/rhq-rtfilter-version.jar JbossHomeDir/server/profileName/lib/
    JBoss EAP/AS already includes the commons-logging.jar file, which is also required for response time filtering.
  4. Then, configure the web.xml for the EAP/AS instance.
    The response time filter can be deployed globally, for all web applications hosted by the EAP/AS instance or it can be configured for a specific web application.
    To configure it globally, edit the global web.xml file:
    [root@server ~]# vim JbossHomeDir/server/configName/deployers/jbossweb.deployer/web.xml
    To configure it for a single web app, edit that one web app's web.xml file:
    [root@server ~]# vim WARLocation/WEB-INF/web.xml
  5. Add the filter and, depending on the configuration, filter mapping elements to the file. This activates the response time filtering.
    All that is required for response time filtering to work is the default <filter> element, without any optional parameters. However, the parameters can be uncommented and set as necessary; the different ones are described in Section 21.6.1, “Parameters for User-Defined <filter>s”.
       <filter>
           <filter-name>RhqRtFilter</filter-name>
           <filter-class>org.rhq.helpers.rtfilter.filter.RtFilter</filter-class>
    
           <!-- Optional parameters. 
          
           <init-param>
               <param-name>chopQueryString</param-name>
               <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>logDirectory</param-name>
              <param-value>/tmp</param-value>
           </init-param>
           <init-param>
               <param-name>logFilePrefix</param-name>
               <param-value>localhost_7080_</param-value>
           </init-param>
           <init-param>
               <param-name>dontLogRegEx</param-name>
               <param-value></param-value>
           </init-param>
           <init-param>
              <param-name>matchOnUriOnly</param-name>
              <param-value>true</param-value>
           </init-param>
           <init-param>
               <param-name>timeBetweenFlushesInSec</param-name>
               <param-value>73</param-value>
           </init-param>
           <init-param>
               <param-name>flushAfterLines</param-name>
               <param-value>13</param-value>
           </init-param>
           <init-param>
               <param-name>maxLogFileSize</param-name>
               <param-value>5242880</param-value>
           </init-param>       
           -->
       </filter>
    
       <!-- Use this only when also enabling the RhqRtFilter in the filter
       <filter-mapping>
           <filter-name>RhqRtFilter</filter-name>
           <url-pattern>/*</url-pattern>
       </filter-mapping>
       -->
  6. Restart the JBoss EAP/AS server to load the new web.xml settings.
  7. Enable the HTTP metrics, as described in Section 21.6.5, “Configuring HTTP Response Time Metrics”, so that JBoss ON checks for the response time metrics on the application server.

21.6.3. Configuring Apache Servers for Response Time Metrics

  1. To use the Response Time module, the Apache server needs to have been compiled with shared object support. For Red Hat Enterprise Linux systems and EWS servers, this is enabled by default.
    To verify that the Apache server was compiled with shared object support, use the apachectl -l command to list the compiled modules and look for the mod_so.c module:
    [root@server ~]# apachectl -l
    Compiled in modules:
      core.c
      prefork.c
      http_core.c
      mod_so.c
    Use the - -enable-module=so option:
    [jsmith@server ~]$ ./configure - -enable-module=so
    [jsmith@server ~]$ make install
  2. Download the Apache binaries from the JBoss ON UI.
    1. Log into the JBoss ON UI.
      https://server.example.com:7080
    2. Click the Administration tab in the top menu.
    3. In the Configuration menu box on the left, select the Downloads item.
    4. Click the connector-apache.zip link, and save the file.
  3. Extract the Apache connectors.
    [root@server ~]# unzip connector-apache.zip
  4. Compile the Response Time module.
    Note
    apxs must be installed, and make must be installed and in the user PATH.
    [root@server ~]# cd apacheMOduleRoot/apache-rt/sources
    [root@server sources]# chmod +x build_apache_module.sh
    [root@server sources]# ./build_apache_module.sh 2.x apache_install_directory/bin/apxs
  5. Then, install the Response Time module on the Apache server.
    [root@server sources]# cp apache2.x/.libs/mod_rt.so apache_install_directory/modules
  6. Open the httpd.conf file. For example:
    [root@server ~]# vim apache_install_directory/conf/httpd.conf
  7. Enable the module in the Apache's httpd.conf file by appending this line to the end of the file:
    LoadModule  rt_module  modules/mod_rt.so
    LogFormat  "%S"  rt_log
    When setting the log format, the variable %S has a capital S.
  8. To configure response time logging for the main Apache server, add the following line at the top level of the file:
    CustomLog  logs/myhost.com80_rt.log  rt_log
    To configure response time logging for a virtual host, add the following line somewhere within the <VirtualHost> block:
    CustomLog  logs/myhost.com8080_rt.log  rt_log
    Make sure the response time log file name is different for the main server and each virtual host. Consider using the host and port from the ServerName directive be used to form the file name, such as host_port_rt.log.
  9. Restart the Apache server:
    [root@server ~]# apachectl -k restart
  10. To confirm that the Response Time module was installed successfully, check that the response time log files configured via the CustomLog directive now exist.
  11. Enable the HTTP metrics, as described in Section 21.6.5, “Configuring HTTP Response Time Metrics”, so that JBoss ON checks for the response time metrics on the application server.

21.6.4. Installing Response Time Filters for Tomcat

  1. Download the Response Time packages for Tomcat from the JBoss ON UI.
    1. Click the Administration tab in the top menu.
    2. In the Configuration menu box on the left, select the Downloads item.
    3. Click the connector-rtfilter.zip link, and save the file.
  2. Unzip the Response Time connectors.
    unzip connector-rtfilter.zip
    The package contains two JAR files, commons-logging-version.jar and rhq-rtfilter-version.jar. Tomcat 5 servers use only the commons-logging-version.jar file, while Tomcat 6 servers require both files.
  3. Copy the appropriate JAR files into the Tomcat configuration directory. The directory location depends on the Tomcat or JBoss instance (for embedded Tomcat) being modified.
    For example, on a standalone Tomcat 5.5:
    cp commons-logging-version.jar /var/lib/tomcat5/server/lib/
    On Tomcat 6:
    cp rhq-rtfilter-version.jar /var/lib/tomcat6/lib/
    cp commons-logging-version.jar /var/lib/tomcat6/lib/
    For example, on an embedded Tomcat instance:
    cp rhq-rtfilter-version.jar JBoss_install_dir/server/default/deploy/jboss-web.deployer/
    cp commons-logging-version.jar JBoss_install_dir/server/default/deploy/jboss-web.deployer/
  4. Open the web.xml file to add the filter definition. The exact location of the file depends on the server instance and whether it is a standalone or embedded server; several common locations are listed in Table 21.4, “web.xml Configuration File Locations”.
  5. Add either a <filter> or a <filter-mapping> entry to configuration the Response Time filter in the Tomcat server. Either a <filter> or a <filter-mapping> entry can be used, but not both.
    The most basic filter definition references simply the Response Time filter name and class in the <filter> element. This loads the response time filter with all of the default settings.
    <filter>
            <filter-name>RhqRtFilter </filter-name>
            <filter-class>org.rhq.helpers.rtfilter.filter.RtFilter </filter-class>
    </filter>
    The filter definition can be expanded with user-defined configuration values by adding <init-param elements. This loads the response time filter with all of the default settings.
    <filter>
            <filter-name>RhqRtFilter </filter-name>
            <filter-class>org.rhq.helpers.rtfilter.filter.RtFilter </filter-class>
            <init-param>
                    <description>Name of vhost mapping file. This properties file must be in the Tomcat process classpath.</description>
                    <param-name>vHostMappingFile</param-name>
                    <param-value>vhost-mappings.properties</param-value>
            </init-param>
    ...
    </filter>
    The available parameters are listed in Section 21.6.1, “Parameters for User-Defined <filter>s”.
    Alternatively, set a <filter-map> entry which gives the name of the response time filter and pattern to use to match the URL which will be monitored.
    <filter-mapping>
            <filter-name>RhqRtFilter </filter-name>
            <url-pattern>/* </url-pattern>
    </filter-mapping>
    Note
    Put the Response Time filter in front of any other configured filter so that the response time metrics will include all of the other response times, total, in the measurement.
  6. Restart the Tomcat instance to load the new configuration.
  7. Enable the HTTP metrics, as described in Section 21.6.5, “Configuring HTTP Response Time Metrics”, so that JBoss ON checks for the response time metrics on the application server.

Table 21.4. web.xml Configuration File Locations

Tomcat Version Embedded Server Type File Location
Tomcat 6 Standalone Server /var/lib/tomcat6/webapps/project/WEB-INF/web.xml
Tomcat 5 Standalone Server /var/lib/tomcat5/webapps/project/WEB-INF/web.xml
Tomcat 6 EAP 5 JBOSS_HOME/server/config/deployers/jbossweb.deployer/web.xml
Tomcat 6 JBoss 4.2, JBoss EAP4 JBOSS_HOME/server/config/deploy/jboss-web.deployer/conf/web.xml
Tomcat 5.5 JBoss 4.0.2 JBOSS_HOME/server/config/deploy/jbossweb-tomcat55.sar/conf/web.xml
Tomcat 5.0 JBoss 3.2.6 JBOSS_HOME/server/config/deploy/jbossweb-tomcat50.sar/conf/web.xml
Tomcat 4.1 JBoss 3.2.3 JBOSS_HOME/server/config/deploy/jbossweb-tomcat41.sar/web.xml

21.6.5. Configuring HTTP Response Time Metrics

Configuring response time metrics is in some respects analogous to configuring events. The JBoss ON agent polls certain log files kept by the web server to identify the performance times for different resources served by the web server.
  1. Install the response time filter for the web server.
    For Apache, simply install the filters; for Tomcat, there is additional configuration required to set up the filter entry in the web.xml file.
    If necessary, set up the filter entry in the web.xml file. See Section 21.6, “Setting up Response Time Monitoring for Apache, EWS/Tomcat, and JBoss EAP 5”.
  2. Click the Inventory tab in the top menu.
  3. Select the Servers menu table on the left, and then navigate to the web application, and select the web application context for which to run the response time monitoring.
  4. Click the Connection Settings tab on the web application context resource, and scroll to the Response Time configuration section.
  5. Configure the response time properties for the web server. The agent has to know what log file the web server uses to record response time data.
    Optionally, the server can perform certain transformations on the collected data.
    • The response times log records times for all resources the web server serves, including support files like CSS files and icons or background images. These resources can be excluded from the response time calculations in the Response Time Url Excludes field.
    • The same page can be accessed with different parameters passed along in the URL. The Response Time Url Transforms field provides a regular expression that can be used to strip or substitute the passed parameters.
  6. Click the Save button.
  7. Click the Monitoring tab on the web server resource entry.
  8. Click the Schedules subtab.
  9. Select the HTTP Response Time metric. This metric is the calltime type.
  10. Click the Enable at the bottom of the list.

Chapter 22. Resource Traits

One category of information that is collected about a resource is its traits. Traits are descriptive information, usually information that does not change very frequently.
For example, traits for a platform include its operating system name and version, its distribution, its architecture, and its hostname. Most resources have similar identifying information, such as a version number or vendor information.
Traits are in-between information. They are detectable and are detected by the JBoss ON agent's monitoring processes. But they are also generalized descriptive information. Trait information is even shown on the resource's details page.

Figure 22.1. Resource Details

Resource Details
The traits that are collected are defined in the resource plug-in itself, so this information is viewable but not configurable through the UI. The list of traits for each resource type is covered in the Resource Reference: Monitoring, Operation, and Configuration Options.

22.1. Collection Interval

By default for most resource types, traits are checked every 24 hours. Because traits change infrequently, they do not need to be collected very often.
The trait collection interval is configured the same as metrics collection intervals. To change the collection interval for a trait, see Section 19.3.2, “Setting Collection Intervals for a Specific Resource”.

22.2. Viewing Traits

The Traits subtab in the Monitoring tab for a resource displays three pieces of information:
  • The trait name. The traits which are monitored for a resource are defined with other monitoring settings in the resource type's plug-in descriptor.
  • The trait value.
  • The time of the last collection where a change in trait information was detected.

Figure 22.2. Trait Charts

Trait Charts

22.3. Extended Example: Alerting and Traits

The Setup

Trait information tends to be static. While traits can, and do, change, they do so infrequently. Also, traits convey descriptive information about a resource, not state data or dynamic measurements, so traits are not critical for IT administrators to track closely.

However, trait information can be valuable in letting administrators know when some configuration or package on the resource has changed because one trait that most resources collect is version information. If a version number changes, then some update has occurred on the underlying resource.

What to Do

For example, Tim the IT Guy has automatic updates configured for his Red Hat Enterprise Linux development and QA servers. Because his production environment has controlled application and system updates, there are no automatic updates for those servers.

Tim wants to be informed of version changes, but they are not necessarily a big deal. JBoss ON can issue an alert when a trait changes, with the Trait Value Change condition. (Cf. Section 25.1.2, “Basic Procedure for Setting Alerts for a Resource”.)

Figure 22.3. Trait Alert Condition

Trait Alert Condition
The alert for the development and QA systems is simple:
  • He sets two conditions, using an OR operator. The alert triggers when the distribution version changes or when the operating system version changes. This catches both minor and major updates to the operating system or kernel.
  • It is set to low priority so it is informative but not critical.
  • Tim decides that the alert notification is sent to his JBoss ON user, so he sees notifications when he logs in. He could also configure an email notification for high-priority resources.
For Tim's production resources, he sets the alert priority to high and uses an email notification to multiple IT administrators so that they are quickly aware of any change to the production systems.

Chapter 23. Resources Which Require Special Configuration for Monitoring

Web servers (Tomcat and Apache resources) require additional configuration for JBoss ON to be able to manage and monitor those resources.

23.1. Configuring Tomcat/EWS Servers for Monitoring

For instructions on setting up Tomcat or Red Hat JBoss Web Server (JWS) for monitoring with JBoss Operations Network, see the JBoss Web Server Installation Guide chapter on Monitoring Red Hat JBoss Web Server with JBoss ON.
NOTE
For more detail on configuring Tomcat, see the Tomcat documentation.

23.2. Configuring the Apache SNMP Module

To discover a Red Hat JBoss Web Server's virtual hosts and collect metrics for them, the SNMP module must be configured on that Red Hat JBoss Web Server.
Apache HTTP Server 2.2 is supported on Red Hat Enterprise Linux and Windows Server.
Important
To monitor Apache HTTP Server using JBoss ON, the SNMP module needs to be compiled prior to configuration following the instructions in Section 23.2.1, “Preparing the Apache SNMP module for use with Apache HTTP Server”.
The SNMP Plugin is already installed and enabled for JBoss ON 3.3, and the SNMP modules are pre-installed within Red Hat JBoss Web Server. To configure Red Hat JBoss Web Server for monitoring with JBoss ON:
  1. Open the httpd.conf file for editing.
    $ sudo vim JWS_install_directory/conf/httpd.conf
    
  2. Enable the module by adding these lines to the httpd.conf file under the Dynamic Shared Object Support section.
    LoadModule snmpcommon_module JWS_install_directory/modules/snmpcommon.so
    LoadModule snmpagt_module JWS_install_directorymodules/snmpmonagt.so
    		
    SNMPConf   JWS_install_directory/conf
    SNMPVar    JWS_install_directory/var
    
    For Windows Server:
    LoadModule snmpcommon_module modules/libsnmpcommon.so
    LoadModule snmpagt_module modules/libsnmpmonagt.so
    		
    SNMPConf   conf
    SNMPVar    var
    
  3. Verify the main configuration section of the httpd.conf file, as well as each <VirtualHost> configuration block, contains a ServerName directive with a port. The SNMP module uses this directive to uniquely identify the main server and each virtual host, so each ServerName directive must contain a unique value. For example:
    ServerName main.example.com:80
    ...
    		
    <VirtualHost vhost1.example.com:80>
    ServerName vhost1.example.com:80
    ...
    </VirtualHost>
    
  4. If there is more than one Apache instance on the same machine, it is possible to use different SNMP files for each instance.
    1. Each web server instance has its own httpd.conf file. Set the SNMPConf directory in each file to its own SNMP configuration directory. For example, for instance1:
      $ sudo vim instance1-httpd.conf
      
      SNMPConf /opt/apache-instance1/conf
      Then, for instance2:
      $ sudo vim instance2-httpd.conf
      
      SNMPConf /opt/apache-instance2/conf
      Each snmpd.conf file should be in the specified directory.
    2. Edit the agentaddress property in JWS_install_directory/conf/snmpd.conf so that each instance has a different value agent address and port, so there is no conflict between instances.
      See the snmpd.conf documentation for a description of this property's syntax.
  5. Restart Red Hat JBoss Web Server.
    $ sudo apachectl -k restart
  6. Verify that the SNMP module is installed and configured by inspecting the Red Hat JBoss Web Server error log for notices regarding the SNMP module:
    $ grep SNMP JWS_installation_dir/logs/error_log
    
    [Wed Mar 19 09:54:34 2008] [notice] Apache/2.0.63 (Unix) CovalentSNMP/2.3.0 configured -- resuming normal operations
    [Wed Mar 19 09:54:35 2008] [notice] SNMP: CovalentSNMP/2.3.0 started (user '1000' - SNMP address '1610' - pid '26738')
    
The HTTP Server can now be added to the JBoss ON monitored resources using the JBoss ON User Interface by selecting the Inventory tab, followed by the Discovery Queue under the Resources menu.

23.2.1. Preparing the Apache SNMP module for use with Apache HTTP Server

If the SNMP connectors are required for supported versions of Apache HTTP Server (other than Red Hat JBoss Web Server 2.x), they need to be compiled from source and installed.
Important
To use the Response Time module, the Apache server needs to have been compiled with shared object support. For Red Hat Enterprise Linux systems and Red Hat JBoss Web Server servers, this is enabled by default.
To verify that the Apache HTTP Server was compiled with shared object support, use the apachectl -l command to list the compiled modules and look for the mod_so.c module:
$ sudo apachectl -l
						
	Compiled in modules:
		  core.c
		  prefork.c
		  http_core.c
		  mod_so.c
To compile Apache HTTP Server with shared object support, use the --enable-module=so option:
$ ./configure --enable-module=so
$ make install
  1. Apache connectors can be compiled for other platforms, like Solaris, from the source files in connector-apache.zip.
  2. Download the source files for the Apache modules from the JBoss ON UI.
    1. Log into the JBoss ON UI.
      https://server.example.com:7080
    2. Click the Administration tab in the top menu.
    3. In the Configuration menu box on the left, select the Downloads item.
    4. Scroll to Connector Downloads, and click the connector-apache.zip link to download the Apache connectors.
    Note
    The zip file containing the source files for the SNMP module can be found at JON_SERVER_INSTALL_DIR/modules/org/rhq/server-startup/main/deployments/rhq.ear/rhq-downloads/connectors/connector-apache.zip.
  3. To compile the SNMP modules, you need to have the following packages installed: perl, make, automake and libtool.
    sudo yum install perl make automake libtool
    Additionally apxs should be present as part of your Apache installation.
  4. Unzip the Apache connectors and run the build script.
    $ sudo unzip connector-apache.zip
    $ sudo cd apache-snmp/sources/
    $ sudo chmod 755 ./build_apache_snmp.sh
    $ sudo ./build_apache_snmp.sh 2.0 [APACHE_INSTALL_DIR/sbin/apxs]
    
  5. Install the module from the directory where build_apache_snmp.sh is located. For example:
    $ sudo cd snmp_module_#
    $ sudo cp module/* apache_install_directory/modules
    $ sudo cp conf/* apache_install_directory/conf
    $ sudo mkdir apache_install_directory/var
    Where # is the Apache version (either 2.0 or 2.2)
    On Windows Server:
    > xcopy /e JON_AGENT_INSTALL_DIR\product_connectors\apache-snmp\binaries\x86
  6. The Apache SNMP module is now ready to be configured using the same process as configuring the module for Red Hat JBoss Web Server, detailed in Section 23.2, “Configuring the Apache SNMP Module”.

23.3. Metrics collection considerations with Apache and SNMP

Three metrics show values of zero when monitoring an Apache instance with the SNMP module:
  • Bytes received for GET requests per minute
  • Bytes received for POST requests per minute
  • Total number of bytes received per minute
This is because of how SNMP interprets information from the request body. First, SNMP provides various length values for the request body and a GET request does not have a body, so GET responses are not calculated and, therefore, have a value of zero. Second, Apache does not calculate a request body size if there is request chunking.

Chapter 24. Storing Monitoring Data

JBoss ON monitoring information reveals both current measurements and historical trends and averages. JBoss ON stores data in a kind of cascade, where raw data are aggregated and compressed on a schedule. This preserves the trends of data without inflating the size of the monitoring data. Data are handled like this:
  • Raw metrics are collected every few minutes and are aggregated in a rolling average in one-hour windows to produce minimum, average, and maximum values.
  • One-hour values are combined and averaged in six-hour periods.
  • Six-hour periods are combined and aggregated into 24-hour (1 day) windows.

24.1. Changing Storage Lengths for Monitoring Data

24.1.1. Default Storage Lengths

The raw measurements, one-hour periods, and six-hour periods are preserved in the JBoss ON database for a predefined lengths of time.

Table 24.1. Storage Times for Data

Data Length of Time Configurable or Hardcoded SQL or NoSQL Database
Raw measurements 7 days Hardcoded NoSQL
1-hour aggregate 14 days Hardcoded NoSQL
6-hour aggregate 31 days Hardcoded NoSQL
24-hour aggregate 365 days Hardcoded NoSQL
Traits 365 days Configurable SQL
Availability data 365 days Configurable SQL
Events data 14 days Configurable SQL
Response-time metrics 31 days Configurable SQL
Note
The storage size of raw metrics can grow very quickly. Be careful when adjusting the storage time of raw metrics to account for the size of your database and the number and frequency of all metrics collected across your inventory.
If you need to store raw data for long periods, consider exporting the raw data from the database and archiving it separately.
Note
All collected metrics are stored in a NoSQL database, in a specified storage node. Additionl nodes can be added on other systems, in a cluster, as the number of resources or collected metrics increases.
All other monitoring data (as well as all other data stored and used by JBoss ON) are stored in the backend SQL database.

24.1.2. Changing the Storage Times for Different Monitoring Data

  1. In the Configuration menu, select the System Settings item.
  2. Scroll to the Data Manager Configuration Properties section.
  3. Change the storage times for the different types of monitoring data.
    There are four settings that relate directly to storing monitoring data:
    • Response time data for web servers and EJB resources. This is kept for one month (31 days) by default.
    • Events information, meaning all of the log files generated by the agent for the resource. The default storage time for event logs is two weeks.
    • Traits for resources. The default time is one year (365 days).
    • Availability information. The default time is one year (365 days).

24.2. Exporting Raw Data

Raw monitoring data is, by default, purged from the database every week. To save the raw data, export it using the CLI.
The MeasurementDataManager class has a method to find the metric values for a specific resource within a certain time range:
findDataForResource(resourceId,[metricId],startTime,endTime,numberOfRecords)
For example, for a resource with the ID 10003 and a metric ID of 10473:
exporter.file = '/export/metrics/metrics.csv'
exporter.format = 'csv'
var start = new Date() - 8* 3600 * 1000;
var end = new Date()
var data = MeasurementDataManager.findDataForResource(10003,[10473],start,end,60)
exporter.write(data.get(0))

24.3. Deploying and Managing Storage Nodes

Raw metric data and aggregated metric data are stored in a dedicated, scalable distributed database. Much like the JBoss ON server, there can be multiple storage nodes in a cluster (installed independently of the server). Nodes can be added and dropped from the cloud easily, allowing the metrics storage to be dynamically expanded according to the needs of the environment.

24.3.1. About High-Speed Metrics Storage

As touched on in Section 16.3, “Potential Impact on Server Performance”, collecting metrics and generating alerts can be resource-intensive. It results in near-constant write operations on the backend database. That creates a natural threshold for the number of metrics that can be collected (30,000 per minute) before encountering performance degradation.
Note
The natural threshold of 30,000 metrics per minute is based on internal performance testing. This threshold varies depending on system resources and overall load.
JBoss ON uses two databases to store its information. One is a central relational database (PostgreSQL or Oracle) which stores all configuration about the JBoss ON servers and agents, all resource inventory data, resource configuration, and other data. The other database is a distributed database (a cluster of storage nodes) which stores all numeric monitoring data — in other words, all collected metrics.
The metrics storage node can be installed on its own dedicated machine, which can significantly improve write performance to the database (and, therefore, improve monitoring performance):
  • Dedicated CPU
  • More available physical memory
  • Faster disks
  • More disk space
These are the same performance considerations as installing the relational database on a separate machine from the JBoss ON server. By using two databases, it is also possible to move write-intensive and resource-intensive metrics storage away from resource-intensive configuration data, such as drift snapshots and bundle configuration.
Additionally, the distributed database can be expanded, with multiple nodes in a cluster. This ability to add additional nodes according to load is a crucial management tool for administrators. Rather than encountering that hardware-driven limit of 30,000 metrics collected per minute, additional nodes can be added to improve performance.
The storage node cluster is created and managed by JBoss ON (which also minimizes the metrics storage node management overhead). A storage node is installed on a system, using the rhqctl install --storage command. The storage node always requires a companion agent.
At least one storage node must be created and managed by JBoss ON (which minimizes the metrics storage node management overhead since no external tools are required).
There are several paths of communication to handle metrics data, all working in parallel.
  1. The agent sends the storage node configuration to the JBoss ON server. The JBoss ON server then sends that updated storage cluster information to every agent associated with a storage node.
    Each companion agent then updates its storage cluster configuration, in the rhq-storage-auth.conf, with the hostname or IP address of the new node. (Likewise, when a node is removed, the server sends the information to each of the companion agents, and the agent removes the hostname or IP address from the list in the local storage node's rhq-storage-auth.conf file.)
  2. The server receives monitoring data from all agents (not just those associated with a storage node), and sends that information to an available storage node to be stored.
  3. The storage nodes replicate their monitoring data among each other for high availability and integrity.

Figure 24.1. Server, Agent, and Metrics Storage Node Communication

Server, Agent, and Metrics Storage Node Communication
Node cluster communication requires three elements:
  • The hostname or IP address of every storage node, stored in the rhq-storage-auth.conf
  • A common port number for the JBoss ON server to use to communicate with the storage node (the client port)
  • A common port number for the other storage nodes in the cluster to use to sync data between each other (the gossip port)
The metrics storage provides data availability and integrity by backing up the data in two ways:
  • Replicating data between the storage nodes (over the gossip port)
  • Taking local snapshots and backing up the data locally

24.3.2. Deploying and Undeploying Storage Nodes

There can be multiple metrics storage databases in the JBoss ON environment. Much like the JBoss ON servers can operate in a cloud, multiple storage nodes communicate with each other to function in a cluster.
Nodes can be added and removed from the cluster by administrators or dynamically using JBoss ON scripts in response to changes in the environment.
With the default cluster configuration, a node is deployed as soon as it is installed and configured in the JBoss ON server. These are actually two separate steps.
  1. The bits are installed on a local system and the storage node is registered with the JBoss ON server.
  2. The new node information is deployed to the cluster.
Whether to deploy nodes automatically or manually can be determined by the provisioning and monitoring requirements of the environment.

24.3.2.1. Storage Node Requirements

There are two critical requirements for deploying nodes:
  • The hostnames and IP addresses of all storage nodes and the hostname and bind addresses of the JBoss ON server and agents must all be fully-resolvable in DNS. If the IP addresses and hostnames of the storage nodes, servers, or agents are not properly formatted in DNS, then all communication between the different JBoss ON components will fail.
  • The firewall must allow communication over the two ports used by the storage nodes. By default, the ports are 9142 and 7100 for the server/client and gossip ports, respectively.

24.3.2.2. Installing Additional Nodes

When creating a new storage node, two components are always installed: the storage node and a companion agent. The only required information to set up the node is the IP address of the JBoss ON server. The agent registers with the server and then sends the storage node hostname or IP address. The server then distributes that information among the other agents and it gets propagated into the cluster.
If the cluster is using custom client and gossip ports, then edit the rhq-storage.properties file with the correct ports before running the installation script.
Run the installation script with the --agent-preference option to supply the server bind address. For example:
[root@server ~]# serverRoot/jon-server-3.3.0.GA/bin/rhqctl install --storage --agent-preference="rhq.agent.server.bind-address=0.0.0.0"
Note
When installing on Linux, the rhqctl command must be run as root. On Windows, the command prompt must be opened with the option Run as Administrator.
The deployment operation can take several minutes to complete, as the new node information is propagated among the existing nodes. Until the deploy operation is complete, the node shows a status of Joining.

Figure 24.2. Joining the Cluster

Joining the Cluster
Warning
Deploying a node lists that node's host in the cluster configuration and any allowed host can gain access to the data in the storage cluster.
Restrict access to the rhq-storage-auth.conf file so that the allowed hosts list cannot be altered to allow an attacker to gain access to the cluster and the stored data.

24.3.2.3. Deploying Nodes Manually

By default, when a new storage node is installed, it is automatically deployed to the cluster. However, there is a cluster setting that disables automatic deployments. This can be useful if machines are being provisioned and taken offline repeatedly (such as in virtual environments) or if nodes should only be online during certain periods.
Warning
Deploying a node lists that node's host in the cluster configuration and any allowed host can gain access to the data in the storage cluster.
Restrict access to the rhq-storage-auth.conf file so that the allowed hosts list cannot be altered to allow an attacker to gain access to the cluster and the stored data.
To deploy a node manually:
  1. Install a node using the rhqctl install command.
  2. Click the Administration tab in the top navigation bar.
  3. In the Topology area on the left, select the Storage Nodes item.
  4. In the Nodes tab, select the row of the node to deploy.
  5. Click the Deploy button.
A node can also be deployed by running a Deploy operation on the storage node resource.
Note
Storage nodes can be deployed using the JBoss ON CLI and scripts, as well. This can be useful when provisioning new systems in the infrastructure.
For example:
// deploy a storage node 
nodes = StorageNodeManager.findStorageNodesByCriteria(StorageNodeCriteria()); 
node = nodes.get(0); StorageNodeManager.deployStorageNode(node);
This can be key in dynamically adding storage nodes as demands on the infrastructure increase. Nodes can be installed in advance and then hot-deployed and removed as necessary.

24.3.2.4. Removing Nodes

Undeploying a storage node removes it from the cluster and then removes the resource from the inventory and uninstalls the storage node bits from the machine.
  1. Click the Administration tab in the top navigation bar.
  2. In the Topology area on the left, select the Storage Nodes item.
  3. In the Nodes tab, select the row of the node to remove. To select multiple rows, hold the Ctrl key and click the desired rows.
  4. Click the Undeploy Selected button, and confirm the operation.
A node can also be removed by running an Undeploy operation on the storage node resource.
Note
Storage nodes can be removed using the JBoss ON CLI and scripts, as well. This can be useful when provisioning and removing systems from the infrastructure.
For example:
// undeploy a storage node 
nodes = StorageNodeManager.findStorageNodesByCriteria(StorageNodeCriteria()); 
node = nodes.get(0); StorageNodeManager.undeployStorageNode(node);

Chapter 25. Defining Alerts

25.1. Planning Alerts

An alert is a configuration setting that lets an administrator know that something has happened to a resource. Conditions and notifications are configured together in an alert definition for a resource.
There are three major components to an alert definition:

25.1.1. An Alerting Strategy in Four Questions

Alerting can be very basic, an immediate notification of something going on with your IT infrastructure. Alerting can also define complex scenarios and detailed responses, allowing administrators to respond automatically and proactively to problems in resources.
Both approaches are equally valid and equally useful, depending on your environment, management processes, and plans. Putting a little effort into planning your alerts — simple and complex — can make your overall management much easier.

25.1.1.1. What's the Condition?

This is probably the single most important thing to determine. What event do you want to know about?
An alert definition can have one condition or multiple conditions, and those conditions can trigger an alert if only one is true or only if all are true. One of the easiest things to do is picture the scenario that needs to be reported and work backward to determine what condition or set of conditions best represents that scenario.
Alerts can be very general ("there was a change") or very specific. General conditions are more informational. Very specific conditions can allow very specific responses — meaning that detailed and useful responses can be automated until an administrator can review and intervene.
There is no limit on the number of alert definitions that can be configured, apart from reasonable performance constraints. It may make sense to have one alert that sends a notification for a simple metric value change, while a series of other alerts look at, say, the change for that metric in relation to a different metric, and then runs a certain JBoss ON CLI script depending on what other factors are involved.

25.1.1.2. What's the Frequency?

A condition can occur and recur, across multiple monitoring scans. How many times do you need to be informed of the same condition?
There are a lot of factors to consider: what kind of response will be taken, what kind of audit trail you need to keep, how common a condition is likely to be, and how long a condition may exist.
Some factors in the alert definition such as dampening rules and disable/recover alert settings throttle how many notifications are sent.
One thing to remember is that an alert notification is every response that the JBoss ON server takes when an alert is triggered. It can be sending an email, posting a message in the UI, running a CLI script, or initiating an operation.
For common, relatively low-priority alerts, probably a single notification is sufficient, so dampening and disable rules can be set to prevent spamming administrators with emails or flooding the recent alerts portlet.
For infrequent issues, critical failures, or high priority resources, the frequency of the alert notification depends on what actions you need to take. Informational alerts may sent for as along as the condition persists, while an alert with an aggressive response like running a CLI script may only be run once and then disabled to prevent from repeatedly running the same script.

25.1.1.3. What's the Response to Take?

Responding to an alert has two phases: the immediate response and then long-term mitigation.
When an alert for a certain condition or event on your resource is fired, what is the very first thing that should happen? Should there be an email to an administrator? Should the JBoss ON server take some action to manage the resource? JBoss ON supports SNMP traps, so should an SNMP notification be sent to an external auditing/monitoring application?
Like conditions, there is no limit to the number of notifications that can be sent when an alert occurs. Multiple responses may make sense in some cases, where JBoss ON should email an administrator and restart a resource can simultaneously.
Operations and CLI script allow administrators to automate responses, and particularly for business-critical systems, this allows JBoss ON to take an immediate action before an administrator may even be aware there is a problem.
Once an immediate action has been taken (either by JBoss ON or by an administrator), then the administrator can acknowledge the alert, essentially dismissing it. That final step can be an important procedure, a way of stating that an administrator has reviewed the alert, identified what caused it, and found a solution.
Having a way of reviewing and signing off on events — not just as a JBoss ON task, but in real life — establishes a reliable procedure for handling IT problems and improving overall maintenance strategies.

25.1.1.4. How Many Resources Does This Affect?

Alert definitions can be set for an individual resource, for a group of compatible resources, or for an entire resource type.
Being able to apply the same definition to multiple resources is invaluable because of how easy it is to manage complex definitions.
Note
Alert templates for a resource type automatically apply to all existing and new resources of that type.
Be cognizant, though, of how different resources of the same type need to be alerted. For example, production servers require a very high level of reporting, monitoring metrics, and, possibly, automated response. QA or development servers may only require general alert information with little scripting or advanced responses. The number, frequency, and type of metrics collected on a production system may vary from those collected on QE or development systems, and that has an impact on what conditions can be used for alert definitions.
Grouping is an asset. QE, development, staging, and production resources can be separated into different groups, with the appropriate levels of monitoring and alerting set on each.

25.1.2. Basic Procedure for Setting Alerts for a Resource

Note
It is not possible to edit an alert condition or an alert notification after they are set. To change the conditions or notifications for an alert definition, delete the condition or notification and create a new one.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the resource name in the list.
  4. Click the Alerts tab for the resource.
  5. In the Definitions subtab, click the New button to create the new alert.
  6. In the General Properties tab, give the basic information about the alert.
    • Name. Gives the name of the specific alert definition. This must be unique for the resource.
    • Description. Contains an optional description of the alert; this can be very useful if you want to trigger different kinds of alert responses at different conditions for the same resource.
    • Priority. Sets the priority or severity that is given to an alert triggered by this definition.
    • Enabled. Sets whether the alert definition is active. Alert definitions can be disabled to prevent unnecessary or spurious alerts if there is, for instance, a network outage or routine maintenance window for the resource.
  7. In the Conditions tab, set the metric or issue that triggers the alert. Click the Add button to bring up the conditions form.
    Note
    There can be more than one condition set to trigger an alert. For example, you may only want to receive a notification for a server if its CPU goes above 80% and its available memory drops below 25MB. The ALL setting for the conditions restricts the alert notification to only when both criteria are met. Alternatively, you may want to know when either one occurs so that you can immediately change the load balancing configuration for the network. In that case, the ANY setting fires off a notification as soon as even one condition threshold is met.
    1. Click the Add a new condition button.
    2. From the initial drop-down menu, select the type of condition. The categories of conditions are described in Section 25.2.1, “Reasons for Firing an Alert”, and the exact conditions available to be set for every resource are listed in the Resource Monitoring Reference.
    3. Set the values for the condition.
  8. In the Notifications tab, click Add to set a notification for the alert.
    1. Select the method to use to send the alert notification in the Sender option.
      The Sender option first sets the specific type of alert method (such as email or SNMP) and then opens the appropriate form to fill in the details for that specific method.
    2. Fill in the required information for the alert sender method. The method may require contact information, SNMP settings, operations, or scripts, depending on what is selected.
  9. In the Recovery tab, set whether to disable an alert until the resource state is recovered. Optionally, select another alert to enable (or recover) when this alert fires.
    A recover alert takes a disabled alert and re-enables it. This is used for two alerts which show changing states, like a pair of alerts to show when availability goes down and then back up.
  10. In the Dampening tab, give the dampening (or frequency) rule on how often to send notifications for the same alert event.
    The frequency for sending alerts depends on the expected behavior of the resource. There has to be a balance between sending too many alerts and sending too few. There are several frequency settings:
    • Consecutive. Sends an alert if the condition occurs a certain number of times in a row for metric calculations. For example, if this is set to three, then the condition must be detected in three consecutive metric collection periods for the alert to be fired. If this is set to one, then it sends an alert every time the condition occurs.
    • Last N evaluations. This sets a number of times that the condition has to occur in a given number of monitoring evaluations cycles before an alert is sent.
    • Time period. The other two similar dampening rules set a recurrence based on the JBoss ON monitoring cycles. This sets the alerting rule based on a specific time period.
  11. Click OK to save the alert definition.

25.1.3. Enabling and Disabling Alert Definitions

When an alert definition is disabled, no alert notifications are triggered for that set of conditions. Disabling definitions is very useful when resources are being taken offline for a know reason (such as upgrades or maintenance) and any alerts triggered during that time would be wrong. Alert definitions can be re-enabled later just as easily.
  1. Click the Inventory tab in the top menu.
  2. Select the resource type in the Resources menu table on the left, and then browse or search for the resource.
  3. Click the Alerts tab.
  4. In the Definitions subtab, select any of the definitions to enable or disable.
  5. Click the Enable or Disable button.
  6. Confirm the action.

25.1.4. Group Alerting and Alert Templates

Most alerts can be defined consistently for multiple resources of the same type. JBoss ON has two ways to accomplish this:
  • Alert templates
  • Alerts on compatible groups
An alert template is a configuration setting for the JBoss ON server. An alert is configured for a specific resource type (even if no resource of that type exists in the inventory yet). Whenever a resource is added, any alert templates in the JBoss ON configuration are automatically applied to that resource. Alert templates can be configured to allow local changes (for example, Resource A may have different baselines or expected behavior, so the alert conditions can be altered). Templates can also be strictly enforced, so that every resource of that type has exactly the same settings.
Note
Alert templates for a resource type automatically apply to all existing and new resources of that type. This can be deselected when the template is edited, so that changes only apply to new resources.
Alerts can be configured on compatible groups. As with alert templates, the compatible group's alert definitions trickle down to the rest of the group members. When a resource is added to a group, the alerts are automatically added to the resource. When the resource is removed from the group, the alert is automatically deleted. Group alerting works for both manual groups and dynamic groups. As with alert templates, group alerts can allow local changes or enforce the group alert settings.
Templates make configuration really easy to apply consistently and often, and JBoss ON allows templates to be set for alerts based on their general resource type.
Group alerts, like alert templates, apply equally to every member of a compatible group. Group alerts offer more control over which resources have the alert definition, however, since resources can be manually added to the group or selected based on a search filter. When a resource joins or leaves a group, its alert definitions are automatically updated.

25.1.4.1. Creating Alert Definition Templates

Alert templates are fully defined alert definitions — from conditions to notification methods — that are created for any of the managed resource types in  JBoss ON. Servers or applications of the same type will probably have the same set of alert conditions that apply, such as free memory or CPU usage. An alert definition template creates an alert based on the general type of resource. So, there can be alert templates for Windows, Linux, and Solaris servers, Tomcat and Apache servers, and services like sshd and cron. Every time a resource of that type is added, then the alert definition is automatically added to the resource with the predefined settings. Any alert assigned to a resource through a template can be edited locally for that resource, so these alert definitions are still flexible and customizable.
Note
Alert templates for a resource type automatically apply to all existing and new resources of that type. This can be deselected when the template is edited, so that changes only apply to new resources.
To create an alert definition template:
  1. In the top navigation, open the Administration menu, and then the Configuration menu.
  2. Select the Alert Templates menu item. This opens a long list of resource types, both for platforms and server types.
  3. Locate the type of resource for which to create the template definition.
  4. Click the New button to create a global alert definition. Set up the alert exactly the same way as setting an alert for a single resource (as in Section 25.1.2, “Basic Procedure for Setting Alerts for a Resource”).
  5. Save the template.
The template definition is then applied to all current and new resources of that type.

25.1.4.2. Configuring Group Alerts

Group alerts can only be set on compatible groups.
  1. In the Inventory tab in the top menu, select the Compatible Groups item in the Groups menu on the left.
  2. In the main window, select the group to add the alert to.
  3. Click the Alerts tab for the group.
  4. In the Definitions subtab, click the New button.
  5. Configure the basic alert definition and notifications, as in Section 25.1.2, “Basic Procedure for Setting Alerts for a Resource”.

25.2. Alert Conditions

Monitoring (Chapter 19, Metrics and Measurements) is closely associated to alerting, in a larger work flow of keeping administrators aware of what is happening in their network. Alerting is based on conditions, the signal that an alert should be issued.
Conditions can be pretty straightforward — if A happens, do B — or they can be complex, with multiple conditions or combinations of readings required before an alert is initiated.

25.2.1. Reasons for Firing an Alert

The condition is any situation, event, or level on a resource that crosses a certain threshold. Basically, a condition sets parameters on what is "normal" behavior or performance for a resource. Once it crosses that boundary, JBoss ON issues an alert. This can be a metric value that has changed to an undesirable level, an event, or a recurring metric reading.
Alerts through alert definitions against are defined for individual resources or for compatible groups of resources. An alert definition specifies the conditions that trigger the alert and the type and settings of any notification that should be triggered.
When an alert is registered, the alert identifies the alert definition which was triggered (which identifies the alert condition) and the metric or event value which precipitated the alert.
An alert conditions answers four questions: what, when, who, and where. The what is the threshold or condition that triggers the alert (such as the free memory drops below a certain point). The when sets the frequency or timing for sending an alert using a defined dampening rule. And the who and where controls how administrators are notified of the alert.
A single condition can be enough to issue an alert, or an alert definition can require that an alert is issued only if multiple conditions are met simultaneously. This provides very granular control over when an alert is issued, which makes alerting information more valuable and relevant.
A condition can be based on any detectable monitoring or system metric, listed in Section 25.2.1, “Reasons for Firing an Alert”. These alert conditions correspond directly to the monitoring metrics available for that type of resource. All of the possible metrics for each resource type are listed in the Resource Monitoring Reference.

Table 25.1. Types of Alert Conditions

Condition Type Description
Metric A specific monitoring area that is checked and the thresholds for that area which trigger a response. Metrics are usually numeric responses of some sort (e.g., percent CPU usage, number of requests, or a cache hit ratio).
Trait A change in a value for a specific setting. Traits are usually string values.
Availability A sudden change in whether the resource is available or unavailable.
Operation A specific action or task that is performed on the resource.
Events A certain type of error message, matching a given string, is recorded. Events are filtered from system or application log files, and the types of events recognized in JBoss ON depend on the event configuration for the resource.
Drift A resource has changed from a predefined configuration.

25.2.2. Detailed Discussion: Ranges, AND, and OR Operators with Conditions

Alerting is based on monitoring information. It is an extension that allows an administrator to receive a notification or define an action to take if a certain event or metrics value occurs.
The monitoring point that triggers an alert is the alert condition. At its most simplistic, an alert condition is a single event or reading. If X occurs, then that triggers an alert.
In real life, X may not be enough to warrant an alert or to adequately describe the state of a resource. Different conditions may require the same response or a situation may only be critical if multiple conditions are true. Alerting is very flexible because it allows multiple conditions to be defined with established relationships between those conditions.
The next level of complexity is to send an alert if either X or Y is true. In the alert definition, this is the ANY option, which is a logical OR. The alert definition checks for any of those conditions, but those conditions are still unrelated to each other.
The last level of complexity is when the conditions have to relate to each other for an alert to be issued. This is the ALL option, which is a logical AND. Both X and Y must occur for the alert to be issued. In this case, when one condition occurs, the server puts a lock on that definition and begins waiting for the second condition to occur. When the second condition occurs, then the alert is issued.
An AND operator is very effective on different metrics, but because the conditions do not have to occur simultaneously, using a simple AND operator does not make sense for the same metric.
For example, Tim the IT Guy only wants an alert to be issued when the user load is between 40% to 60%, indicating slightly increased loads on his platform. Attempting to use an AND operator returns strange values when the load spikes over 70% (which trips the above 40% condition) and then falls back to 15% (which triggers the below 60% condition).
In this case, Tim uses a range condition. A range requires two values from the same metric that are within the given boundaries. A range can be inside values (40-60%) or it can be an outside range (below 40% and above 60%).

Figure 25.1. Alert Condition Range

Alert Condition Range

25.2.3. Detailed Discussion: Conditions Based on Log File Messages

Events (Chapter 20, Events) are filtered log messages. Certain resource in JBoss ON maintain their own error logs, like platforms and JBoss EAP servers. JBoss ON can scan these error logs to detect events of certain severity or events matching certain patterns. This allows JBoss ON administrators to provide an easy way to identify and view important error messages.
Because JBoss ON can detect log events, JBoss ON can alert on log events. An event-based condition requires the severity of the log file message and, optionally, a pattern to use to match specific messages.

Figure 25.2. Log File Conditions

Log File Conditions
Setting only a severity alerts on any event with that severity. That is useful for severe or fatal errors, which are relatively infrequent and need immediate attention.
Note
For general error message, use a pattern to filter for a specific error type. Then, use a resource operation or CLI script to take a specific action to address that specific error, like restarting a resource or starting a new web app.

25.2.4. Detailed Discussion: Dampening

Dampening does not define an alert condition, but it tells the JBoss ON server how to handle recurring conditions.
An alert can be issued every single time a condition is met, or an alert can be issued and then disabled until an administrator acknowledges it. Dampening a condition is useful to prevent multiple alerts and notifications from being sent for a single ongoing set of circumstances.

Figure 25.3. Dampening Filter

Dampening Filter
For example, Tim the IT Guy sets an alert to let him know if his platform CPU usage spikes above 80%. Every time the monitoring scan runs, the metric is again determined to be true, and is treated as a new and discrete instance. If the metric schedule is set for 10 minutes and it takes him an hour to respond to the alert, he could have six or seven alert notifications before he has a chance to respond. If the only alert response is an email, this is an annoyance, but not a problem.
If, however, Tim the IT Guy has an alert response to create a new JBoss server if his EAP connection count goes above a certain point, the same response could be taken six or seven times, when it actually should be done once and then the condition will naturally resolve itself.
Dampening is another set of instructions on how to evaluate the condition before triggering another alert. It tells JBoss ON how to interpret those monitoring data.
  • JBoss ON could send an alert every time the condition is encountered. In that case, there would be multiple alerts issued if the CPU percentage bounced around, while only one alert would be sent if it hit it briefly or hit it and stayed there.
  • JBoss ON could send an alert only if the condition was encountered a certain number of times consecutively or X number of times out of Y number of polls. In this case, only a recurring or sustained problem would trigger an alert. A momentary spike or trough wouldn't be enough to fire a notification.
    A condition may need to occur several times over a short period of time for it to be a problem, but once is not a problem. For example, a server may bounce between 78% and 80% CPU over several minutes, it could hit 80% once for only a few seconds, or it could hit 80% and stay there. The condition may only be relevant if the CPU hits 80% and stays, and the other readings can be ignored.
  • A notification is sent only if the problem occurs within a set time period. This can be useful to track the frequency of recurring problems or to track how long a condition persisted.
Note
Dampening is only relevant for metrics which are variable and compared to some kind of baseline. Monitoring metrics with thresholds and value changes are dynamic, so each reading really is a new condition, even if it matches the previous reading. Dampening, then, controls those multiple similar readings.
Conditions which relate directly to a status change do not compare themselves against a baseline — they only compare against a previous state. For example, if the system configuration changes from the drift template or if the availability changes, that is a one-time change. After that, the resource has a new status and future changes would be compared against the new status — so, in a sense, it is a different condition.
Dampening does not apply conceptually for drift and availability changes.

25.2.5. Detailed Discussion: Automatically Disabling and Recovering Alerts

There are a couple of different ways to limit how often an alert notification is sent for the same observed condition. One method is a dampening rule. An alternative is to disable an alert the first time it is fired, and