JBoss Operations Network 3.1.1

Frequently Asked Questions

common questions, issues, and fixes

JBoss Operations Network Team

Legal Notice

Copyright © 2012 Red Hat, Inc.
This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, MetaMatrix, Fedora, the Infinity Logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack Logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.
June 12, 2012, updated October 3, 2012

Abstract

These questions cover some common issues, setup information, requirements, and errors for JBoss ON.
1. General
Q: What is the difference between JBoss Operations Network and RHQ?
Q: Is there a publicly available issue tracker system to search for bugs and submit enhancement requests?
Q: What databases are supported?
Q: Why can't I start JBoss ON with Java 5?
Q: How can I find what my user preferences are?
Q: What is the syntax for regular expressions used within JBoss ON?
Q: How often does JBoss ON check the availability of resources?
Q: Why is the JBoss ON agent waiting at startup?
Q: How do I install a supported version of PostgreSQL on Red Hat Enterprise Linux?
Q: How can I run SQL commands against the JBoss ON database from the JBoss ON console?
Q: Is JBoss ON supported on VMWare?
Q: To help debug Out Of Memory conditions, how do I get the agent or server to dump heap when it runs out of memory or on demand?
2. Installation and Upgrade Issues
Q: I'm seeing error messages when I install (or upgrade) my server. What do they mean?
Q: I've installed my server, but I can't connect to it. What's wrong?
Q: The installer fails on PostgreSQL with "Relation RHQ_Principal does not exist."
Q: I upgraded my server, but when I try to connect to the installer page to configure it, it keeps trying to redirect me to the (old) coregui/ module. How do I get to the installer?
Q: The JBoss ON install fails on Oracle with the ORA-01843.
3. User Interface
Q: How can I ignore an auto-discovered resource?
Q: I selected a search suggestion from the resource search box, but I didn't get any results. Why?
Q: Errors and stack traces in the GWT Message Center are sometimes not helpful. How can I find out what the real problem is?
Q: Why are the graphs and charts on the MONITOR tab in the GUI not displayed?
4. Server
Q: When I start the server, I see servlet errors in my logs. What's wrong?
Q: How do I get debug messages from the JBoss ON server?
Q: How can I specify command-line options for the server JVM?
Q: How do I purge my schema of all data?
Q: How can I debug JDBC access and trace SQL?
Q: How can I confirm my server's email/SMTP settings are correct?
Q: My server machine does not have a writable directory called /var/run. How can I get my rhq-server.sh script to successfully write out its pid file?
Q: When I try to start the server, I get an exception with the cause "Exception creating identity" and the server fails to start. How can I fix this?
Q: My server logs are showing the message "Have not heard from agent ... Will be backfilled since we suspect it is down." What does that mean?
Q: What ports do I have to be concerned about when setting up a firewall between servers and agents?
Q: I installed the server as a Windows service, but it is failing to start with no error messages. How can I start the server as a Windows service?
Q: How do I fix an ORA-12519, TNS:no appropriate service handler found error when using Oracle XE?
Q: I am seeing this error in my server logs or stack trace: WARN [QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory. What does that mean and what is causing it?
Q: How do I stop the server from periodically logging messages that say a plug-in is "the same logical plug-in" but has "different content" and "will be considered obsolete"?
Q: What is the difference between LDAP user authentication and LDAP group authorization in JBoss ON?
Q: How do I set up LDAP group authorization?
5. Agent
Q: I have a physical machine hosting multiple virtual machines with shared disk resources. How can I run an agent on each virtual instance?
Q: How do I get debug messages from the JBoss ON agent?
Q: How do I restrict which agents are allowed to connect to the server?
Q: Do I have to run the agent as root?
Q: How do I clean start the JBoss ON agent, as if newly installed?
Q: How can I do a "clean config" for an agent running as a background Windows service?
Q: How can I update the plug-ins on all my agents?
Q: How can I change the agent name after it has already been registered?
Q: I want to run agents on all my machines, but only one starts OK. The rest fail due to binding to a wrong address.
Q: When starting the agent via a Windows service, the agent fails to start, and I see the error "java.lang.IllegalStateException: The name of this agent is not defined - you cannot start the agent until you give it a valid name" in the agent wrapper log file. What does this mean?
Q: My agent setup is correct but my agent is getting the following message: "Cause: org.jboss.remoting.CannotConnectException: Can not connect http client invoker."
Q: My agent machine does not have a writable directory called /var/run. How can I get my rhq-agent-wrapper.sh script to successfully write out its pid file?
Q: How often does the agent scan for resources?
Q: How can I view the agent's persisted configuration?
Q: How can I find out what environment variables and Java system properties are set in my agent JVM process?
Q: How can I get a dump of inventory information from an agent running on another machine?
Q: I need to change the IP address of my agent machine. How do I keep my server and agent up to date with that change?
Q: How can I stop my agent from thinking the server keeps going up and down when the server has remained running the whole time?
6. Log Messages
Q: What are "Command failed to be authenticated" messages?
Q: What are "fail-safe cleanup" messages?
7. Server and Agent Plug-ins
Q: How can I extend JBoss ON?
Q: How can I write a plug-in for JBoss ON?
Q: What is the skeleton plug-in module?
8. General Resource Questions
Q: I deleted a Platform from inventory. How can I rediscover it, so I can re-import it?
Q: On a Red Hat Enterprise Linux platform, interface "sit0" is discovered, but it is always red. How can I remove this interface from inventory?
Q: How can I collect syslog messages as JBoss ON events?
Q: Executing a script resource fails on Red Hat Enterprise Linux.
9. JBoss Resources
Q: Why does only one JBoss AS server show green availability and all the rest show red, even though I made sure all of my JNP credentials are configured properly in my resources' connection properties?
Q: When I import a server like JBoss EAP 5 or Tomcat, I see its child JVM resource in inventory, but it is red (DOWN). Why?
Q: When trying to monitor a JBoss EAP instance, I get the error "Connection failure Failed to authenticate principal=null, securityDomain=jmx-console."
Q: When monitoring a JBoss AS instance, I'm not seeing any JVM resources beneath it.
Q: Can I monitor JBoss AS 5.1?
Q: My agent can detect my JBoss server and gets its connection properties, but the JNP connection fails. Why?
10. Postgres Resources
Q: Why is the agent showing an error in my PostgreSQL discovery about authentication failed for user "postgres"?
Q: Why are most of the metrics for my Postgres resource showing up as NaN?
Q: How many database connections are necessary to monitor a Postgres database?
Q: Why can't I drop my database that is inventoried in JBoss ON?
11. Apache Resources
Q: Where can I get the Apache connectors?
Q: I have instrumented Apache with the Response Time module, but no RT metrics are being shown for my VirtualHosts.
Q: Some of my Apache metrics show values of zero. Why?
Q: What is the Augeas plug-in?
Q: Why does my agent log have the error "java.lang.UnsatisfiedLinkError: Unable to load library 'augeas': libaugeas.so: cannot open shared object file: No such file or directory"?
Q: Why does my Apache SNMP module fail to start with an error?
12. Tomcat Resources
Q: I get the error "This resource's configuration has not yet been initialized" when I go to the Configuration tab for a Tomcat resource. Why?
13. Provisioning and Content
Q: When I try to create a bundle by uploading an Ant recipe XML directly, the XML content seems to get corrupted and tags are placed out of order.
Q: What does the JBoss Patch Content feature of JBoss ON actually do? Is it completely automated?
14. Alerts
Q: I just created an alert definition, and I know that my agent reported data that should have fired an alert immediately. But I don't see an alert. Why not?
Q: Why do I see alerts triggered on different metric values on different alert definition conditions when they are using the same metric?
15. Monitoring
Q: Why does the Events tab not capture the events of one of the "Log Event Sources"?
Q: When do baselines auto-calculate?
16. Operations
Q: I clicked the Operations tab, but I don't see any available operations, and it says "No items to show." How do I schedule an operation?

1. General

Q:
What is the difference between JBoss Operations Network and RHQ?
A:
RHQ is the upstream, open source version of JBoss Operations Network. JBoss Operations Network is a commercial product which is built on RHQ, with extensive testing and official customer support through Red Hat.
Q:
Is there a publicly available issue tracker system to search for bugs and submit enhancement requests?
A:
To search for a bug, report a bug, or submit an enhancement request, use Bugzilla. Select the Other distribution and then the RHQ Project component.
Q:
What databases are supported?
A:
PostgreSQL 8.2.4 and higher 8.2.x versions; all releases of PostgreSQL 8.3, 8.4, and 9.0; and Oracle 11 are supported databases.
The full list of supported databases, server and agent platforms, and Java versions is available at http://www.redhat.com/jboss_on/requirements.
Q:
Why can't I start JBoss ON with Java 5?
A:
Java 5 is no longer supported. Upgrade to Java 6.
The full list of supported databases, server and agent platforms, and Java versions is available at http://www.redhat.com/jboss_on/requirements.
Q:
How can I find what my user preferences are?
A:
From either a database client or the http://server.hostname:7080/admin/test/sql.jsp page, run this SQL command:
select id, name, string_value
from rhq_config_property
where configuration_id = (select configuration_id
                          from rhq_subject
                          where name = 'your-user-name')
Q:
What is the syntax for regular expressions used within JBoss ON?
A:
JBoss ON uses regular expressions in several places in both the UI and in configuration files. All of the regular expressions follow the standard Java format, as covered in the Sun documentation for regex syntax and date/time format syntax.
Q:
How often does JBoss ON check the availability of resources?
A:
Agents check resource availability every minute for servers and every 10 minutes for services. The agent itself runs an availability check every 30 seconds, on a subset of its total inventory; this keeps a steady load on the agent and prevents memory-intensive usage spikes. The agent sends the results to the JBoss ON server.
The frequency that a specific resource is checked for availability is scheduled, much like scheduling the frequency for a metric. This availability check interval can be changed for a resource on its Monitoring > Schedules tab.
How often the agent itself runs an availability check is defined in the rhq.agent.plugins.availability-scan.period-secs setting. The default is 30 seconds; for performance reasons, it should never be set lower than that. The scan interval can be extended by passing a new value through the agent's RHQ_AGENT_ADDITIONAL_JAVA_OPTS environment variable. For example:
RHQ_AGENT_ADDITIONAL_JAVA_OPTS="-Drhq.agent.plugins.availability-scan.period-secs=45"
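As a minimal sketch, a small shell function can build that option and refuse values below the 30-second floor described above. The function name scan_period_opt is hypothetical and not part of JBoss ON; only the property name and the RHQ_AGENT_ADDITIONAL_JAVA_OPTS variable come from this answer.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with JBoss ON): print the JVM option for a
# given availability scan period, rejecting values under 30 seconds.
scan_period_opt() {
    period="$1"
    case "$period" in
        ''|*[!0-9]*)
            echo "scan period must be a whole number of seconds" >&2
            return 1
            ;;
    esac
    if [ "$period" -lt 30 ]; then
        echo "scan period should not be lower than 30 seconds" >&2
        return 1
    fi
    printf '%s\n' "-Drhq.agent.plugins.availability-scan.period-secs=$period"
}

# Example: extend the scan period to 45 seconds before starting the agent.
RHQ_AGENT_ADDITIONAL_JAVA_OPTS="$(scan_period_opt 45)"
export RHQ_AGENT_ADDITIONAL_JAVA_OPTS
```

Validating the value in the shell keeps an accidental low setting from reaching the agent JVM.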
Q:
Why is the JBoss ON agent waiting at startup?
A:
Sometimes when you start the JBoss ON agent from the command line, it will not proceed to the sending> prompt, but sit there and wait. There are several possible reasons for this.
The server has rejected the agent registration request.
If the agent returns this message at start up, it means that the agent is known to the server under one name but is sending a different name when it starts:
Cause: [org.rhq.core.clientapi.server.core.AgentRegistrationException:The 
agent asking for registration is trying to register the same address/port 
[172.31.7.7:16163] that is already registered under a different name [example]; 
if this new agent is actually the same as the original, then re-register with
the same name]
To solve this, start the agent with the --clean option and give the correct name.
The agent cannot reach the server.
This is an agent state where the server cannot be reached because the server is down or because a firewall has blocked the traffic. Make sure port 7080 on the server machine is reachable from the agent's machine. You can check this through a web browser.
The server cannot connect to the agent.
An error saying that the server cannot ping the agent's endpoint means that the agent can communicate with the server, but the server cannot communicate with the agent. This may mean that the agent port is blocked by a firewall.
The server has rejected the agent registration request. Cause: 
[org.rhq.core.clientapi.server.core.AgentRegistrationException:Server cannot 
ping the agent's endpoint. The agent's endpoint is probably invalid or there 
is a firewall preventing the server from connecting to the agent. Endpoint:
socket://172.31.7.3:12345/....
The agent does not have plug-ins - it will now wait for them to be downloaded.
This usually means that the server has a different security token than the one the agent was sending. This could have resulted from the Java preferences entry being mangled, for example, by testing with different agent versions or VMs.
You will see this message only on initial agent startup when it does not have any plug-ins. If plug-ins were downloaded in a previous run, you will probably run into the situation shown below.
If you see this on the agent, you should also see messages like this on the server:
11:40:48,454 WARN [CommandProcessor] {CommandProcessor.failed-
authentication}Command failed to be authenticated! This command will be 
ignored and not processed: Command: type=[remotepojo]; cmd-in-response=
[false]; config=[{rhq.security-token=1217855913569-109582636-403140853869881172, rhq.send-throttle=true}];
params=
[{targetInterfaceName=org.rhq.core.clientapi.server.core.CoreServerService, 
invocation=NameBasedInvocation[getLatestPlugins]}]
To solve this, start the agent interactively with the --clean option.
Agent startup is OK, but ping command fails.
Here, the agent starts successfully, but there may be other agent communication problems, such as no monitoring data being sent. Running ping from the agent command line can return an error like the following:
sending> ping
Pinging...
Failed to execute prompt command [ping]. Cause: 
org.rhq.enterprise.communications.command.server.AuthenticationException:Command 
failed to be authenticated! This command will be ignored and not processed: 
Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-
token=1214208960346-102975580-7334156733284942657, rhq.send-throttle=true}]; 
params=[{targetInterfaceName=org.rhq.enterprise.communications.Ping, 
invocation=NameBasedInvocation[ping]}]
The server log will probably contain several CommandProcessor.failed-authentication messages. This means that the agent's port is probably blocked by a firewall so that the server cannot communicate with it.
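To confirm how many of these messages the server has logged, a sketch along the following lines can help. It assumes each failed-authentication entry sits on a single line of the server log (the examples in this document are wrapped for display). The function names are hypothetical.

```shell
#!/bin/sh
# Count failed-authentication entries in a server log file.
# grep -c prints the number of matching lines; -F treats the pattern literally.
count_auth_failures() {
    grep -c -F 'CommandProcessor.failed-authentication' "$1"
}

# List the distinct security tokens that appear in those entries.
# Assumption: the token is logged on the same line as the failure marker.
auth_failure_tokens() {
    grep -F 'CommandProcessor.failed-authentication' "$1" |
        grep -o 'rhq\.security-token=[0-9-]*' |
        sort -u
}
```

For example, run count_auth_failures against serverRoot/jon-server-3.1.0.GA1/logs/rhq-server-log4j.log. A single repeated token suggests one agent is responsible.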
The forward and backward mappings of the IP address for high availability don't match.
Make sure the IP address of your computer can be reverse-mapped to the computer name, and that this name maps back to the same IP address. This needs to be true for all your hosts.
For example, if you have an IP address of 172.31.7.7, then the results of name resolution will look like the following:
$ dig -x 172.31.7.7
[...]
;; ANSWER SECTION:
7.7.31.172.in-addr.arpa. 86400 IN     PTR     example
$
$ dig example
[...]
;; ANSWER SECTION:
example       74030   IN      A       172.31.7.7
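The round trip shown above can be scripted. The following is a sketch only: it assumes dig is installed, and the function names are hypothetical. The two lookups are kept in separate functions so they can be swapped for another resolver tool.

```shell
#!/bin/sh
# Reverse lookup: print the first PTR name for an IP address.
reverse_lookup() { dig +short -x "$1" | head -n 1; }

# Forward lookup: print the records dig returns for a name, one per line.
forward_lookup() { dig +short "$1"; }

# Check that an IP address maps to a name and that the name maps back to it.
check_roundtrip() {
    ip="$1"
    name="$(reverse_lookup "$ip")"
    name="${name%.}"    # dig prints fully qualified names with a trailing dot
    if [ -z "$name" ]; then
        echo "no reverse mapping for $ip"
        return 1
    fi
    if forward_lookup "$name" | grep -qxF "$ip"; then
        echo "$ip <-> $name: OK"
    else
        echo "$name does not map back to $ip"
        return 1
    fi
}
```

Calling check_roundtrip 172.31.7.7 on each server and agent host gives a quick pass or fail for the mapping requirement.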
If your agent-server communication was working and it stopped suddenly, go to the JBoss ON server UI. Then go to Administration > High Availability > Server and check if the name displayed matches what you expect. Check the same for the agents.
If this all works, and the agent is still hanging, there may be other possibilities for misconfiguration.
Q:
How do I install a supported version of PostgreSQL on Red Hat Enterprise Linux?
A:
The default PostgreSQL version on Red Hat Enterprise Linux 5.5 is an older, unsupported version of PostgreSQL, so it has to be updated.
To install Postgres from Red Hat Network:
  1. Log into http://rhn.redhat.com with your RHN/JBoss credentials.
  2. Add the Red Hat Application Stack v2 channel.
  3. Update the system:
    sudo yum update
  4. Then update PostgreSQL specifically:
    sudo yum install postgresql-server
    The data directory is installed in /var/lib/pgsql/data. JBoss ON supports PostgreSQL 8.2.4 and later 8.2.x versions and all releases of PostgreSQL 8.3, 8.4, and 9.0.
  5. Install and configure the JBoss ON server as normal.
Q:
How can I run SQL commands against the JBoss ON database from the JBoss ON console?
A:
Go to the SQL page:
http://server.example.com:7080/admin/test/sql.jsp
Q:
Is JBoss ON supported on VMWare?
A:
VMWare ESX is supported by Red Hat Enterprise Linux as equivalent to bare metal. In the case of a hardware issue with a VMWare product, work with VMWare support to resolve the issue.
JBoss EAP is fully supported on all current versions of Red Hat Enterprise Linux. Support levels and SLA will vary depending on the entitlements you purchased.
Q:
To help debug Out Of Memory conditions, how do I get the agent or server to dump heap when it runs out of memory or on demand?
A:
Pass these JVM arguments to the server or agent (setting the RHQ_AGENT_ADDITIONAL_JAVA_OPTS or RHQ_SERVER_ADDITIONAL_JAVA_OPTS variables).
-XX:+HeapDumpOnOutOfMemoryError -XX:+HeapDumpOnCtrlBreak
To drop the heap dump file in a particular location, add a path:
-XX:HeapDumpPath=location
See the Sun JVM debugging options documentation for more information.
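As a sketch, the options can be appended to the server variable without discarding anything already set in it. The helper name heap_dump_opts and the directory /var/tmp/jon-dumps are illustrative assumptions, not JBoss ON defaults. The same pattern applies to RHQ_AGENT_ADDITIONAL_JAVA_OPTS.

```shell
#!/bin/sh
# Hypothetical helper: print the heap-dump JVM options for a dump directory.
heap_dump_opts() {
    printf '%s\n' "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$1"
}

# Keep any options that are already set, then add the heap-dump options.
# /var/tmp/jon-dumps is an example location only.
RHQ_SERVER_ADDITIONAL_JAVA_OPTS="${RHQ_SERVER_ADDITIONAL_JAVA_OPTS:+$RHQ_SERVER_ADDITIONAL_JAVA_OPTS }$(heap_dump_opts /var/tmp/jon-dumps)"
export RHQ_SERVER_ADDITIONAL_JAVA_OPTS
```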

2. Installation and Upgrade Issues

Q:
I'm seeing error messages when I install (or upgrade) my server. What do they mean?
A:
During the upgrade, you may see error messages in the console similar to the following:
ERROR [ClientCommandSenderTask] {ClientCommandSenderTask.send-failed}Failed to send
command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.timeout=1000,
rhq.send-throttle=true}]; params=[{targetInterfaceName=org.rhq.enterprise.communications.Ping,
invocation=NameBasedInvocation[ping]}]]. Cause: org.jboss.remoting.CannotConnectException:[.....]
These can be ignored.
Q:
I've installed my server, but I can't connect to it. What's wrong?
A:
If the installer was not bound to 0.0.0.0 when it was run, then the necessary connection properties are not set for the server in the rhq-server.properties file. The java.rmi.server.hostname parameter must be set manually to the real IP address of the server, which matches the value of the jboss.bind.address parameter. Restart the server after editing the rhq-server.properties file to load the new settings.
Q:
The installer fails on PostgreSQL with "Relation RHQ_Principal does not exist."
A:
First, ensure that the JBoss ON server installer has the correct permissions to connect to PostgreSQL. Open the PostgreSQL configuration file pg_hba.conf and check that the permissions have been enabled. The Installation Guide has more information on setting up PostgreSQL for installation.
Q:
I upgraded my server, but when I try to connect to the installer page to configure it, it keeps trying to redirect me to the (old) coregui/ module. How do I get to the installer?
A:
Some browsers cache the previous coregui/ module and attempt to redirect you there automatically after upgrading, even though the upgraded coregui/ module has not yet been loaded.
Simply navigate directly to the installer page:
http://server.example.com:7080/installer/start.jsf
Q:
The JBoss ON install fails on Oracle with the ORA-01843.
A:
This issue occurs when Oracle runs in a locale where the abbreviation for April is not APR (as it is in English and German). There are currently two workarounds:
  • Put Oracle in a different locale.
  • Edit one of the server distribution files before running the installer:
    1. Remove the old server directory and unzip the install package again.
    2. Open the serverRoot/jon-server-3.1.0.GA1/jbossas/server/default/rhq-installer.war/WEB-INF/classes directory.
    3. Edit db-data-combined.xml. Update a few dates in the form 01-APR-08 to be in the current locale.
    4. Save the file.
    5. Re-run the installer and choose to overwrite the database.

3. User Interface

Q:
How can I ignore an auto-discovered resource?
A:
If your agent discovers a new platform and finds a few resources that you do not want to take into inventory, you have to tell the server to ignore those resources.
First, you can select the resources to import in the auto-discovery portlet and deselect the unwanted resources. As long as they are displayed in the portlet, they are not imported.
The other option is to select the resource you do not want to import and click on Ignore, so it no longer appears in the portlet. However, if you try this on a resource on a freshly discovered platform, it will fail. The reason for this is that the inventory is organized in a tree-like manner with the platform as a tree-root. When a server or service is taken into the system, regardless of whether it is imported or ignored, it is attached below that root. When the platform is not yet imported into the inventory, there is no root that the ignored resource can be attached to.
You can ignore a server on a platform by performing the following steps:
  1. Import the platform and leave that server unchecked.
  2. When the platform is successfully imported, select the server and click Ignore.
It is not possible to just ignore a platform. If you want to ignore a platform, do not run an agent on it.
Q:
I selected a search suggestion from the resource search box, but I didn't get any results. Why?
A:
The suggestions in the search drop-down are not filtered by category — but the search results are. For example, if you are in the Server tab, the dynamic search suggestions will prompt for type==CPU, even though CPU is in the service category. If you select type==CPU, then nothing is returned in the search, because the search results are filtered by the category, and the search is implicitly set to the server category.
Q:
Errors and stack traces in the GWT Message Center are sometimes not helpful. How can I find out what the real problem is?
A:
If there are errors in the UI, an error ID is displayed, enclosed in square brackets. That can be used to track down the error and stack trace in the JBoss ON server's log file. For example:
java.lang.RuntimeException:[1312480384219] ...
The server-side log information is more useful because these exceptions occur on the server side and are forwarded to the GWT client.
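A minimal sketch of that lookup follows. The function names are hypothetical, and it assumes the same numeric ID appears verbatim in the server log.

```shell
#!/bin/sh
# Pull the numeric error ID out of a UI message such as
# "java.lang.RuntimeException:[1312480384219] ..."
extract_error_id() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([0-9][0-9]*\)].*/\1/p'
}

# Print the server log lines (with line numbers) that mention that ID.
# $1 is the UI message, $2 is the path to the server log file.
find_error_in_log() {
    id="$(extract_error_id "$1")"
    if [ -z "$id" ]; then
        echo "no error ID found in message" >&2
        return 1
    fi
    grep -n -F "$id" "$2"
}
```

For example, pass the message text and serverRoot/jon-server-3.1.0.GA1/logs/rhq-server-log4j.log to find_error_in_log, then read the stack trace that follows the reported line number.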
Q:
Why are the graphs and charts on the MONITOR tab in the GUI not displayed?
A:
Errors like the following can appear in the JBoss ON server log:
java.lang.NoClassDefFoundError: Could not initialize class org.rhq.enterprise.gui.image.chart.ColumnChart
To generate the text in graphs and charts, Java requires specific system fonts to be installed. Error messages will appear if the required font package is not installed. On Red Hat Enterprise Linux, make sure the urw-fonts package is installed:
yum install urw-fonts

4. Server

Q:
When I start the server, I see servlet errors in my logs. What's wrong?
A:
If agents are already running when the server starts, errors related to the Servlet.service() method can be recorded in the logs:
22:55:35,319 ERROR [[ServerInvokerServlet]] Servlet.service() for servlet
ServerInvokerServlet threw exception
java.lang.reflect.UndeclaredThrowableException
        at $Proxy421.processRequest(Unknown Source)
        at org.jboss.remoting.transport.servlet.web.ServerInvokerServlet.processRequest(ServerInvokerServlet.java:128)
        at org.jboss.remoting.transport.servlet.web.ServerInvokerServlet.doPost(ServerInvokerServlet.java:157)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
... more ...
This error is normal and is related to the order in which the server loads its classes when it starts. The remoting classes are loaded early in the startup sequence, so the server begins handling agent communication before it is fully started, and this can cause the errors recorded in the logs. These errors should go away once the server is completely started.
The errors can be safely ignored.
Q:
How do I get debug messages from the JBoss ON server?
A:
To enable debug messages, edit the serverRoot/jon-server-3.1.0.GA1/jbossas/server/default/conf/jboss-log4j.xml configuration file and uncomment the org.rhq category, which sets its priority to DEBUG. Debug messages are then written to the log file for all JBoss ON subsystems. To emit debug messages for only a subset of the JBoss ON server internals, uncomment just the categories you want or add your own categories.
There are several commented-out categories in jboss-log4j.xml, each with a comment that briefly explains what types of debug messages to expect from that category. You can also emit debug messages for third-party subsystems like JBoss/Remoting and Hibernate; some of these categories are already present, commented out, in jboss-log4j.xml.
After you make your changes to the jboss-log4j.xml file, save the file and then restart the JBoss ON server.
serverRoot/jon-server-3.1.0.GA1/bin/rhq-server.sh|bat stop
serverRoot/jon-server-3.1.0.GA1/bin/rhq-server.sh|bat start
Debug messages are in the log file, serverRoot/jon-server-3.1.0.GA1/logs/rhq-server-log4j.log.

NOTE

By default, the console window will not display the debug messages. This is because the log4j CONSOLE appender has a threshold at INFO. If you want your debug messages to also appear on the console, you must change the CONSOLE appender's threshold setting to DEBUG.
In some cases, you will want debug messages from the JBoss ON server launcher scripts. To get them, set the environment variable RHQ_SERVER_DEBUG to any value. With this variable set, the launcher scripts output debug messages when you start the server.
Q:
How can I specify command-line options for the server JVM?
A:
On Red Hat Enterprise Linux, to override the default max heap and permgen sizes, set them via the RHQ_SERVER_JAVA_OPTS environment variable. For example:
RHQ_SERVER_JAVA_OPTS="-Dapp.name=rhq-server -Xms256M -Xmx1024M -XX:PermSize=128M -XX:MaxPermSize=256M -Djava.net.preferIPv4Stack=true"
export RHQ_SERVER_JAVA_OPTS
Set all other JVM options via the RHQ_SERVER_ADDITIONAL_JAVA_OPTS environment variable. For example:
RHQ_SERVER_ADDITIONAL_JAVA_OPTS="-Dfoo=true"
export RHQ_SERVER_ADDITIONAL_JAVA_OPTS

On Windows, for all other JVM options, add wrapper.java.additional.n lines to <server-install-dir>\bin\wrapper\rhq-server-wrapper.inc (you may need to create the file). For example:
  • wrapper.java.additional.12=-verbosegc:file=gc-log.txt
  • wrapper.java.additional.13=-XX:+HeapDumpOnOutOfMemoryError
  • wrapper.java.additional.14=-XX:HeapDumpPath=heap-dump.txt
Q:
How do I purge my schema of all data?
A:
There are instances where it's necessary to completely purge the database schema of all data. This is helpful when writing custom plug-ins and a lot of the resource hierarchy information and metadata needs to be replaced. To delete all the data from the database but keep the schema intact, simply re-install the server:
  1. Save the current JBoss ON server directory.
    mv jon-server-3.1.0.GA1/ jon-server-3.1.0.GA1.bak/
  2. Unzip the latest JBoss ON binaries.
    unzip jon-server-3.1.0.GA1.zip
  3. Start the new server process.
    serverRoot/jon-server-3.1.0.GA1/bin/rhq-server.sh start
  4. Open the JBoss ON GUI and go through the installation setup. When given the choice, select the option to Overwrite existing data. This removes all of the data for the previous installation of the server.
Q:
How can I debug JDBC access and trace SQL?
A:
You can debug JDBC access and trace SQL using log4jdbc.
Q:
How can I confirm my server's email/SMTP settings are correct?
A:
To check that the server can send emails successfully, log into the GUI as the rhqadmin user and open the email test page:
http://server.example.com/admin/test/email.jsp
Q:
My server machine does not have a writable directory called /var/run. How can I get my rhq-server.sh script to successfully write out its pid file?
A:
Set the environment variable RHQ_SERVER_PIDFILE_DIR to the full path of the directory where you want the pid file to be stored. When you run the script, that variable's value will override the default location. If you have a script that is 2.1 or older, directly edit rhq-server.sh and change /var/run to the desired directory.
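As a minimal sketch of the approach above, assuming a Bourne-style shell: the SERVER_DIR variable and both directory paths are illustrative placeholders, not values defined by JBoss ON. The launch is guarded so that the snippet does nothing if the script is not found at the assumed location.

```shell
# Placeholder locations (assumptions); adjust both for your installation.
SERVER_DIR="${SERVER_DIR:-/opt/jon-server}"   # assumed server install directory
PID_DIR="${HOME:-/tmp}/rhq-server/run"        # any directory writable by the server user

# Create the directory and point rhq-server.sh at it for its pid file.
mkdir -p "$PID_DIR"
RHQ_SERVER_PIDFILE_DIR="$PID_DIR"
export RHQ_SERVER_PIDFILE_DIR

# Start the server only if the launcher script is present.
if [ -x "$SERVER_DIR/bin/rhq-server.sh" ]; then
    "$SERVER_DIR/bin/rhq-server.sh" start
fi
```

The variable must be set the same way in every shell that later runs the script (for example, to stop the server), so that the script looks for the pid file in the same place.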
Q:
When I try to start the server, I get an exception with the cause "Exception creating identity" and the server fails to start. How can I fix this?
A:
The message you are referring to probably looks similar to this:
Caused by: java.lang.RuntimeException: Exception creating identity: my.host.name.com: my.host.name.com
        at org.jboss.remoting.ident.Identity.get(Identity.java:211)
This is not specific to JBoss ON. It is caused by a failure in JBoss/Remoting communications, typically because the machine's hostname is not resolvable; JBoss/Remoting hides the real error message. For JBoss ON to work correctly, all servers and agents must be able to resolve each other's hostnames. Best practice is to maintain a mapping of all servers and agents in host files (for example, /etc/hosts), so that JBoss ON continues to work correctly even if DNS fails. If host files are not practical for your environment, then before you begin your JBoss ON installation, use a tool such as nslookup to verify that each host you plan to run JBoss ON on can correctly resolve every other hostname in your planned environment.

NOTE

This applies even if you use IP addresses exclusively for configured values, as the server and agents perform host lookups for certain functions.
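The resolution check described above can be scripted. The following is a sketch only: the unresolved_hosts function and the RESOLVER variable are hypothetical names introduced for this example, and it assumes that getent is available on the host (any lookup command that exits non-zero on failure can be substituted).

```shell
# Hypothetical helper: print each given hostname that does not resolve.
# RESOLVER is the lookup command; "getent hosts" is an assumed default.
RESOLVER=${RESOLVER:-"getent hosts"}

unresolved_hosts() {
    for host in "$@"; do
        # $RESOLVER is intentionally unquoted so "getent hosts" splits into words.
        if ! $RESOLVER "$host" >/dev/null 2>&1; then
            printf '%s\n' "$host"
        fi
    done
}
```

Run it on every planned server and agent machine with the full list of hostnames, for example unresolved_hosts jon-server.example.com agent1.example.com. No output means that every name resolved on that machine.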
Q:
My server logs are showing the message "Have not heard from agent ... Will be backfilled since we suspect it is down." What does that mean?
A:
[org.rhq.enterprise.server.core.AgentManagerBean] Have not heard from agent [agent_name]
since [timestamp]. Will be backfilled since we suspect it is down

This means that the agent did not send its availability report in the required amount of time. The default is 2 minutes, but you can configure this on the Administration > System Configuration > Settings page. When the availability report is not sent in the required amount of time, the server assumes the agent is down. At this time it back-fills the availability of all resources managed by that agent to DOWN and the resource availabilities turn red.
This can happen for a number of reasons:
  1. The agent actually shut down or crashed.
  2. The machine the agent is running on shut down or crashed.
  3. The network between the agent and server went down, preventing the agent from connecting to the server and sending the availability report.
  4. The machine the agent is running on is bogged down, which slows the agent and prevents it from sending reports fast enough.
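To see which agents have been affected, the server log can be scanned for this message. This is a sketch that relies only on the message format shown above; the backfilled_agents function name is hypothetical, and the location of the server log file depends on your installation.

```shell
# Hypothetical helper: read server log text on standard input and print
# the unique agent names found in "Have not heard from agent [...]" lines.
backfilled_agents() {
    sed -n 's/.*Have not heard from agent \[\([^]]*\)].*/\1/p' | sort -u
}
```

Feed it the server log, for example backfilled_agents < /path/to/server.log, where the path is a placeholder. An agent that appears repeatedly over time is a candidate for one of the causes listed above.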
Q:
What ports do I have to be concerned about when setting up a firewall between servers and agents?
A:

NOTE

These are the default values. Different values can be configured for JBoss ON servers or agents when they are installed.
The default server ports are 7080 (standard) and 7443 (secure SSL).
The default agent port is 16163 for both standard and secure connections.
The server also has to communicate with its database. The default port depends on the type of database.
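As an illustration only, the following sketch prints (but does not apply) firewall rules for the default ports listed above. It assumes that iptables is the firewall in use; the jon_fw_rules function name is hypothetical, and the rules must be adapted if different ports were configured at installation. The database port is not covered because it depends on the database type.

```shell
# Hypothetical helper: print iptables rules for the default JBoss ON ports.
# Nothing is applied; review the output before running any of it as root.
# Usage: jon_fw_rules server|agent
jon_fw_rules() {
    case "$1" in
        server) ports="7080 7443" ;;   # default server ports (standard, secure)
        agent)  ports="16163" ;;       # default agent port
        *) echo "usage: jon_fw_rules server|agent" >&2; return 1 ;;
    esac
    for port in $ports; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}
```

Run jon_fw_rules server on a server machine and jon_fw_rules agent on an agent machine to see the rules that would open the listening ports on each side.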
Q:
I installed the server as a Windows service, but it is failing to start with no error messages. How can I start the server as a Windows service?
A:
You probably installed the server to run as the local system account. Either that account does not have the proper permissions to run the server, or the machine has been locked down for security reasons so that the local system account cannot access the network or run Java.
To solve this, create a user on your Windows box that can run the server properly. To test the user permissions, log in as the user and execute rhq-server.bat console to see if it can be run by that user. Then, install the server as a Windows Service with the RHQ_SERVER_RUN_AS_ME environment variable set to true:
rhq-server.bat remove
set RHQ_SERVER_RUN_AS_ME=true
rhq-server.bat install

Q:
How do I fix an ORA-12519, TNS:no appropriate service handler found error when using Oracle XE?
A:
Although Oracle XE is not supported for production environments, some places use it for test or development environments. To stop the ORA-12519 error, change the PROCESSES parameter:
ALTER SYSTEM SET PROCESSES=150 SCOPE=SPFILE;
Then restart the Oracle XE database.
Q:
I am seeing this error in my server logs or stack trace: WARN [QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory. What does that mean and what is causing it?
A:
This error is issued by the Hibernate service and can be triggered for a number of different reasons. This error can be ignored.
Q:
How do I stop the server from periodically logging messages that say a plug-in is "the same logical plug-in" but has "different content" and "will be considered obsolete"?
A:
This is a known issue, tracked in Bugzilla 676073. To work around it, shut down the server, remove the plug-in jars from the server's filesystem, and restart the server.
Q:
What is the difference between LDAP user authentication and LDAP group authorization in JBoss ON?
A:
Authentication is the process that is used to verify that an entity attempting to access a resource is the identity that it is claiming to be. Authorization is the process of determining what rights an entity has to access a resource after its identity has been established. Let's say that a user named jsmith is attempting to log into JBoss ON. Authentication is the process of checking that the jsmith trying to log in is the same as the jsmith user that JBoss ON has in its database; this can be validated by verifying the password. Once jsmith logs in, then JBoss ON determines what resources jsmith can view and whether he can edit those resources' configuration, provision new applications, change server settings, and perform other tasks in JBoss ON.
Typically, JBoss ON uses its own user database to identify and authenticate users. It is possible to enable LDAP authentication by letting JBoss ON check an LDAP server for user information first, and using that base of users for valid JBoss ON users. This is LDAP authentication. This is essentially pass-through authentication. A user attempts to log into JBoss ON. JBoss ON first sends the credentials to the LDAP server to see if the LDAP server has stored that user; if authentication fails at the LDAP server, then JBoss ON checks its own database.
All authorization in JBoss ON is based on roles. Both users and resource groups are added to roles, and then permissions are assigned to those roles. These roles are created and managed in JBoss ON, but it is possible to use membership in an LDAP group to supply the users in a JBoss ON role. Essentially, this takes an existing list of users in LDAP and just says, "use this list for the role members." The LDAP group is added to a JBoss ON role, and then every member in that LDAP group automatically has whatever rights the role has. That is LDAP authorization.

NOTE

LDAP authentication is recommended for LDAP authorization, but it is not required.
Q:
How do I set up LDAP group authorization?
A:
LDAP authorization is set up in the Administration tab, under System Settings.
First set up JBoss ON to allow LDAP users to authenticate using LDAP user accounts ("Configuring LDAP User Authentication"). (LDAP authentication isn't required, but it is recommended.) Then, configure JBoss ON to check for LDAP groups on the LDAP server ("Associating LDAP User Groups to Roles in JBoss ON").
There are five elements in the LDAP server configuration that you need to know to configure LDAP group authorization:
  • The information to connect to the LDAP server, in the form of an LDAP URL. For example, ldap://server.example.com:1389.
  • The username and password to use to connect to the server. This account should have read access to the subtrees being searched.
  • The search base. This is the point in the directory tree to begin looking for entries. This should be high enough to include all entries that you want to include and low enough to improve performance and prevent unwanted access. For example, if you have ou=Web Team,dc=example,dc=com and ou=Engineering,dc=example,dc=com and you want to include groups in both subtrees in JBoss ON, then set the search base high up the tree, to dc=example,dc=com. If you only want the engineering groups to be used by JBoss ON, then set the search base to ou=Engineering,dc=example,dc=com.
  • The group filter. This creates the search filter to use to search for group entries. This can use the group object class, which is particularly useful if there is a custom attribute for JBoss ON-related entries. This can also point to other elements — like the group name, a locality, or a string in the entry description — that are useful or meaningful to identify JBoss ON-related groups.
  • The member attribute. There are different types of group object classes, and most use different attributes to identify group members. For example, the groupOfUniqueNames object class lists its members with the uniqueMember attribute.
After LDAP authorization is enabled, you can associate the roles in JBoss ON with the appropriate groups in the LDAP directory.
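Before entering the five values in JBoss ON, it can help to confirm that they return the expected groups. The sketch below composes an ldapsearch command from them and prints it for review. It assumes the OpenLDAP ldapsearch client; the bind DN is a hypothetical read-only account, the ldap_group_check function name is made up for this example, and the other values are the examples used in the list above.

```shell
# Example values; replace them with the settings for your directory.
LDAP_URL="ldap://server.example.com:1389"
BIND_DN="cn=jon-reader,dc=example,dc=com"        # hypothetical read-only account
SEARCH_BASE="dc=example,dc=com"
GROUP_FILTER="(objectClass=groupOfUniqueNames)"
MEMBER_ATTR="uniqueMember"

# Print the query instead of running it, because -W prompts for the
# bind password interactively.
ldap_group_check() {
    printf 'ldapsearch -x -H "%s" -D "%s" -W -b "%s" "%s" cn %s\n' \
        "$LDAP_URL" "$BIND_DN" "$SEARCH_BASE" "$GROUP_FILTER" "$MEMBER_ATTR"
}

ldap_group_check
```

Running the printed command by hand should list each matching group with its name and members. If a group is missing, the search base or group filter is probably too narrow.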

5. Agent

Q: I have a physical machine hosting multiple virtual machines with shared disk resources. How can I run an agent on each virtual instance?
Q: How do I get debug messages from the JBoss ON agent?
Q: How do I restrict which agents are allowed to connect to the server?
Q: Do I have to run the agent as root?
Q: How do I clean start the JBoss ON agent, as if newly installed?
Q: How can I do a "clean config" for an agent running as a background Windows service?
Q: How can I update the plug-ins on all my agents?
Q: How can I change the agent name after it has already been registered?
Q: I want to run agents on all my machines, but only one starts OK. The rest fail due to binding to a wrong address.
Q: When starting the agent via a Windows service, the agent fails to start, and I see the error "java.lang.IllegalStateException: The name of this agent is not defined - you cannot start the agent until you give it a valid name" in the agent wrapper log file. What does this mean?
Q: My agent setup is correct but my agent is getting the following message: "Cause: org.jboss.remoting.CannotConnectException: Can not connect http client invoker."
Q: My agent machine does not have a writable directory called /var/run. How can I get my rhq-agent-wrapper.sh script to successfully write out its pid file?
Q: How often does the agent scan for resources?
Q: How can I view the agent's persisted configuration?
Q: How can I find out what environment variables and Java system properties are set in my agent JVM process?
Q: How can I get a dump of inventory information from an agent running on another machine?
Q: I need to change the IP address of my agent machine. How do I keep my server and agent up to date with that change?
Q: How can I stop my agent from thinking the server keeps going up and down when the server has remained running the whole time?
Q:
I have a physical machine hosting multiple virtual machines with shared disk resources. How can I run an agent on each virtual instance?
A:
You can run multiple agents on the same box, but each agent must run out of its own agent installation directory.
Q:
How do I get debug messages from the JBoss ON agent?
A:
The quickest way to get your agent to start logging debug messages is to set the environment variable RHQ_AGENT_DEBUG to any value before starting the JBoss ON agent. When you start the agent, both the launcher scripts and the agent itself will output debug messages. When you use this environment variable, the agent will use an internal log4j configuration file.
For more fine-grained control over what log4j categories have DEBUG priority, edit the conf/log4j.xml file and restart the agent to load the changes. Do not set RHQ_AGENT_DEBUG if you want the agent to use the log4j.xml file; setting that environment variable causes the agent to override this log4j.xml with an internally configured log4j configuration.
The log messages can be found in the log files located in the agentRoot/rhq-agent/logs directory.
If you are launching the JBoss ON agent on Microsoft Windows using the service wrapper, you must set RHQ_AGENT_DEBUG and then install the service:
rhq-agent-wrapper.bat install

Q:
How do I restrict which agents are allowed to connect to the server?
A:
Change the server's JBoss/Remoting servlet configuration to restrict the IPs agent requests can come from. If the agents are on a specific subnet, then connections can be restricted to only that subnet.
  1. Create a file for the restriction rule, with this name and location:
    vim serverInstallDir/jbossas/server/default/deploy/rhq.ear/jboss-remoting-servlet-invoker-2x.r3040.jon.war/WEB-INF/context.xml
  2. Add this content to the file:
    <?xml version="1.0" encoding="UTF-8"?> 
    <Context> 
    <Valve className="org.apache.catalina.valves.RemoteAddrValve" 
    allow="192.168.*,142.104.128.*,10.224.27.182"/> 
    </Context>
    The allow= attribute lists the IPs that are allowed to connect to the server. All other IPs are blocked.
Q:
Do I have to run the agent as root?
A:
You do not have to run the agent as root.
However, some resource types have very strict limits on which users can access their configuration files and processes, and running the agent as root may be the only way to manage or monitor those resources.
For example, the PostgreSQL plug-in lets the agent probe the PostgreSQL configuration file, postgresql.conf. Running the agent as a non-root user without PostgreSQL privileges means that the agent cannot read and manage the file. (And there will be log messages in the agent log saying so.) There are other resources that have similarly privileged files, like iptables and even some JBoss servers.
Running the agent as root gives the agent privileges to manage all those things. Without that privilege level, the agent has restricted views of managed resources.
Q:
How do I clean start the JBoss ON agent, as if newly installed?
A:
There are three points of configuration for the agent: the agent's (local) persisted configuration, the agent inventory (and associated resource data), and the platform entry in the server inventory.
To clean the agent configuration and restart the agent as if it were new, there are two steps to take:
  1. Remove the platform entry from the JBoss ON server inventory. Since the platform entry is representative of the agent entry, this effectively removes the agent from the JBoss ON topology.
  2. Stop the agent and then restart it using the --fullcleanconfig (-L) command-line option.
    agentRoot/rhq-agent/bin/rhq-agent.sh --fullcleanconfig
    The --fullcleanconfig option removes all of the local inventory for the agent, reloads its configuration fresh from the agent-configuration.xml file, and re-registers the agent with the server.
    Optionally, pass the --config argument to have it start up with a user-specified configuration file. Otherwise, the default conf/agent-configuration.xml file is used. If no directory is given, then the command looks for the configuration file in the agent's conf/ directory.
    agentRoot/rhq-agent/bin/rhq-agent.sh --fullcleanconfig -c my-agent-configuration.xml
Q:
How can I do a "clean config" for an agent running as a background Windows service?
A:
The JBoss ON agent Windows service (like all Windows services) runs as a specific user. Go to that user's Windows registry and delete the agent's configuration node. The RHQ Agent uses the standard Java Preferences API, so the agent's configuration is stored as a node under the normal Java Preferences location in the Windows registry, such as HKEY_CURRENT_USER\Software\JavaSoft\Prefs\rhq-agent\default. ("default" is the name of the preferences node.)
Deleting the node only removes the previous configuration; the agent has to be reconfigured before it can be started again. The simplest path is to start the agent in the foreground and go through the interactive agent configuration, then stop that session and start the agent as a service.
Alternatively, the agent can be forced to re-read its configuration from the agent-configuration.xml file. Because an agent running as a service cannot be started in the foreground, this is a little more difficult.
If the agent has been added as a JBoss ON resource, you can invoke the "Execute Prompt Command" operation and run the config --import agent-configuration.xml command.
Alternatively, you can edit the rhq-agent-wrapper.conf file and add a line for a third parameter:
wrapper.app.parameter.3=--cleanconfig
This forces the agent to re-read its configuration from the agent-configuration.xml every time it is started as a service. In this case, the agent-configuration.xml must be preconfigured with all of the required (and optional) settings for the agent, so that it restarts with the correct configuration.
Q:
How can I update the plug-ins on all my agents?
A:
When you add a new plug-in to your system or upgrade an existing plug-in, you can instruct your agents to update their existing plug-ins with the new plug-in versions.
You can do this individually by executing the prompt command plugins update at any agent prompt or through the UI with the Update All Plugins operation in the agent's OPERATIONS tab.
To update all of the agents with the latest plug-ins, launch a group operation on the JBoss ON agents autogroup. First, create an autogroup: navigate to the Browse Resources page, then click Group Definitions > New Definition. The autogroup definition should have the expression:
resource.resourceType.pluginName = RHQAgent
resource.resourceType.typeName = RHQ Agent

This creates a compatible group that dynamically adds all JBoss ON agents as members to that group.

NOTE

If you already have a compatible group with your agents as members, you can skip this group creation step.
Next, navigate to the agent group. In the OPERATIONS tab, invoke the Update All Plugins operation. This tells all of your agents in that group to update their plug-ins. Once that group operation is completed, all of your agents will have up-to-date versions of all plug-ins.
Q:
How can I change the agent name after it has already been registered?
A:
When you start the agent for the first time, you are asked in the setup screen for an agent name. This name must be unique across all agents in your environment. Once it is registered you cannot change this name. If you attempt to re-register this agent, you must re-register it with the same name that it was registered with before.
The agent name is not the same as the JBoss ON agent resource name in the UI. If you import a JBoss ON agent resource into inventory, that resource's name is agent_name JON agent. This JBoss ON agent resource name can be changed by editing its value within the INVENTORY tab. Changing this name does not change the name that the agent is registered under. Your agent is still registered under its original agent name.
Q:
I want to run agents on all my machines, but only one starts OK. The rest fail due to binding to a wrong address.
A:
There are a couple of things that can cause this error.
FATAL [main] (org.jboss.on.agent.AgentMain)-
{AgentMain.startup-error}The agent encountered an error during startup
and must abort java.net.BindException: Cannot assign requested address


First, did you change the agent-configuration.xml manually (to change IP addresses, for example) after setting up the agent? The agent's configuration XML file is not referenced after the agent is set up because its configuration is persisted using Java Preferences. Persisting the configuration allows the agent to be updated or re-installed without losing its configuration. To change the agent's configuration file and have those changes picked up, restart the agent and pass the --config command line option (or -c, which is shorthand for --config). This tells the agent to re-read the configuration file and override any old configuration it persisted before.
If your home directory is stored on NFS, then you are probably picking up the same Java preferences across all machines ($HOME/.java on Red Hat Enterprise Linux). This is usually not an issue on Windows since the Java preferences are stored in the registry. If you are running the agents as the same user and your user's home directory is shared, then have the agents use different Java preferences names. Edit your agents' agent-configuration.xml files and change their Java preferences node names from default to something that makes them unique across all agents. For example:
<node name="NewName">


Since you overrode the default location, every time you start your agent you need to tell the agent where it can find its preferences. You tell the agent the new preference name using the --pref option. Since you changed the configuration file, restart the agent with the -c option to specify the configuration file.
agentRoot/rhq-agent/bin/rhq-agent.sh --pref NewName -c agent-configuration.xml

Subsequently, always restart the agent with the --pref option to pass in the node name.
Alternatively, define the system property java.util.prefs.userRoot to point to another, unique location for preferences. When the agent starts, Java will use the value of that system property as the location where it will store its Java Preferences. You can set this system property on the agent by setting the RHQ_AGENT_ADDITIONAL_JAVA_OPTS environment variable. When you set that environment variable, rhq-agent.sh will add its value to the default set of Java options when passing in options to the agent's Java VM:
RHQ_AGENT_ADDITIONAL_JAVA_OPTS="-Djava.util.prefs.userRoot=/etc/rhq-agent-prefs"
export RHQ_AGENT_ADDITIONAL_JAVA_OPTS
agentRoot/rhq-agent/bin/rhq-agent.sh

Q:
When starting the agent via a Windows service, the agent fails to start, and I see the error "java.lang.IllegalStateException: The name of this agent is not defined - you cannot start the agent until you give it a valid name" in the agent wrapper log file. What does this mean?
A:
The agent cannot ask for its initial setup configuration when installing as a Windows service because there is no console for the user to see and answer the prompts. This means that you need to either pass the information through a preconfigured file or run the agent in standard, non-service, mode once as the user that should run the service to configure it before installing it as a service.
Q:
My agent setup is correct but my agent is getting the following message: "Cause: org.jboss.remoting.CannotConnectException: Can not connect http client invoker."
A:
This error is typically seen when the server's endpoint address is not set to something that can be resolved by the agent. The public endpoint address set for each server must be resolvable by every JBoss ON agent because of the high availability cloud configuration of JBoss ON servers.
Check your server endpoint information in the high availability pages in the GUI and change the settings if necessary. After the update, restart the agent.
Q:
My agent machine does not have a writable directory called /var/run. How can I get my rhq-agent-wrapper.sh script to successfully write out its pid file?
A:
Set the environment variable RHQ_AGENT_PIDFILE_DIR to the full path of the directory where you want the pid file to be stored. When you run the script, that variable's value will override the default location. If you have an older script (2.1 or older), directly edit rhq-agent-wrapper.sh and change /var/run to the desired directory.
Q:
How often does the agent scan for resources?
A:
When an agent is installed, it scans the platform, and all applications on it, for any servers, services, or other items which can be included into the inventory. The process of finding potential resources is discovery.
There are different scans for each type of resource: platform, server, and service. High-level scans for servers and platforms are initiated by the agent every 15 minutes. A service scan detects lower-level services that are running in servers that have already been imported into the inventory. Service scans run by default every 24 hours. Both of these intervals are configurable.

NOTE

A server must be imported into the inventory before any of its child processes, servers, or services can be detected by the discovery scan.
Q:
How can I view the agent's persisted configuration?
A:
The agent's configuration is initially read from agent-configuration.xml and overlaid with the values given during agent setup. After the agent is initially configured, it persists that configuration and never refers back to agent-configuration.xml, unless you clear the configuration.
There are several ways to view the agent's persisted configuration:
  1. If the agent is in the JBoss ON inventory, simply go to your agent's Configuration tab to view its live configuration. This is the same configuration that is persisted.
  2. If the agent is currently running in non-daemon mode (i.e. you have the agent prompt on your console), you can use the getconfig or config prompt commands to view the live configuration. Type help getconfig or help config for more information.
  3. If the agent is in the JBoss ON inventory, run the Execute Prompt Command operation and invoke the getconfig prompt command.
  4. Because the agent configuration is stored in the standard Java Preferences API backing store, you can use any tool that can examine Java preferences, such as Google's Java Preferences Tool. This is a GUI tool that can give you a file system-like view into your Java preferences. The agent preferences are stored in the User preferences node under the node name rhq-agent. Depending on the -p option that is passed to the agent for its node name when it is started, the actual configuration settings are found under a sub-node under rhq-agent. The default preferences node is called default, so typically your agent's persisted configuration is found in the user preferences under rhq-agent/default.

WARNING

Do not attempt to change the values of the preferences using third-party tools without knowing what you are doing. This could disable the agent if you change the wrong preference to the wrong value. Use this mechanism only to view your agent's configuration.
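On Red Hat Enterprise Linux, the preferences can also be inspected directly on disk. The sketch below assumes Java's default file-based user preferences store, which keeps each node as a directory containing a prefs.xml file under $HOME/.java/.userPrefs; it does not apply if java.util.prefs.userRoot has been overridden. The agent_prefs_file function name is hypothetical.

```shell
# Hypothetical helper: print the path of the agent's persisted preferences
# file for a given preferences node name (the default node is "default").
# Assumes the default file-based Java user preferences store.
agent_prefs_file() {
    node="${1:-default}"
    printf '%s\n' "$HOME/.java/.userPrefs/rhq-agent/$node/prefs.xml"
}

prefs=$(agent_prefs_file default)
if [ -r "$prefs" ]; then
    cat "$prefs"          # read-only view; do not edit this file by hand
else
    echo "no persisted configuration found at $prefs"
fi
```

Run this as the user that runs the agent, and pass the node name used with the -p option if the agent does not use the default node.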
Q:
How can I find out what environment variables and Java system properties are set in my agent JVM process?
A:
The version agent prompt command shows a list of the agent process' environment variables and system properties. version --sysprops provides a list of all the system properties, and version --env provides a list of all the environment variables. (At the agent prompt, run help version for the syntax of that command.)
Q:
How can I get a dump of inventory information from an agent running on another machine?
A:
If the agent inventory becomes corrupted, dumping the agent's inventory can help debug the problem.
To get this information, get the agent's data/inventory.dat file. Copy that file to the local machine. Then, run an agent on the local machine, with the same plug-ins as the other agent. The agent doesn't necessarily have to be connected to a server, but the plug-in container must be started, so the agent has to have been registered. Then, export the information from the copied DAT file:
inventory --xml --export=/bad-inventory.xml /the/bad/inventory.dat
If you do not specify the --export option, the XML will simply be dumped to the stdout console window.
Now you have an XML file that describes what the other agent thinks is its inventory.
Q:
I need to change the IP address of my agent machine. How do I keep my server and agent up to date with that change?
A:
The agent has a configuration preference named rhq.communications.connector.bind-address which sets the IP address the agent binds to when it starts its server socket (the socket it listens on for incoming messages from the server).
If you change the agent's IP address (and invalidate the old agent IP address), you have to do two things:
  1. Change the agent's configuration so that the preference value matches the new IP address. Issue a setconfig prompt command at the agent prompt:
    setconfig rhq.communications.connector.bind-address=IP_address
    Do not change agent-configuration.xml; the changes will not take effect.
  2. Restart the agent after editing its configuration. If the agent is running in the background as a daemon process, shut it down with the script (rhq-agent-wrapper.sh|bat) and start it again.
Once the agent is restarted, it will use that new IP address.
Q:
How can I stop my agent from thinking the server keeps going up and down when the server has remained running the whole time?
A:
Messages like the following can appear in the agent logs:
INFO (org.rhq.enterprise.agent.AgentAutoDiscoveryListener)- {AgentAutoDiscoveryListener.server-offline}
The Agent has auto-detected the Server going offline [InvokerLocator
[servlet://server:7080/jboss-remoting-servlet-invoker
/ServerInvokerServlet?rhq.communications.connector.rhqtype=server]] -
the agent will stop sending new messages 
...
INFO (org.rhq.enterprise.agent.AgentAutoDiscoveryListener)- {AgentAutoDiscoveryListener.server-online}
The Agent has auto-detected the Server coming online [InvokerLocator
[servlet://server:7080/jboss-remoting-servlet-invoker
/ServerInvokerServlet?rhq.communications.connector.rhqtype=server]] -
the agent will be able to start sending messages now
This means the agent has auto-detected, through the multicast detector, the server going down and then back up. This is different from the detection-via-polling, which is the second way the agent attempts to detect the server's status.
If the agent is erroneously detecting the server going up or down, it is possible that either the network does not support multicast traffic or the multicast network is acting abnormally. In either case, disable the agent multicast detector and have the agent instead rely on polling to detect changes in the server status.
To turn off the multicast detection, set the following agent preferences to false:
  • rhq.agent.server-auto-detection
  • rhq.communications.multicast-detector.enabled
Since you are disabling multicast detection, ensure that the polling detection feature is enabled, meaning the rhq.agent.client.server-polling-interval-msecs value is larger than 0, typically 60000. Otherwise, the agent will never be able to know when the server goes down.
Once you reconfigure the agent, restart the agent so the communications subsystem can detect the changes.

6. Log Messages

Q:
What are "Command failed to be authenticated" messages?
A:
Agents are assigned security tokens when they first register with the server. The token is one way an agent identifies itself with the server. If an agent does not identify itself with any token, or if it identifies itself with a wrong token, the server will deny access to that agent. The server will therefore reject commands that come from that agent until that agent has been properly registered.
02:31:33,095 WARN  [CommandProcessor] {CommandProcessor.failed-authentication}
Command failed to be authenticated! This command will be ignored and not processed:
Command: type=[identify]; cmd-in-response=[false]; config=[{}]; params=[null]
Authentication failures usually mean that the agent has been misconfigured or that an unknown agent is attempting to identify itself as another agent. Restart your agent with the --cleanconfig command line option to clean out its configuration and re-register.

NOTE

Do not rely on the security token mechanism as a way of protecting your JBoss ON environment from intrusion. Configure SSL for agent-server communications.
Q:
What are "fail-safe cleanup" messages?
A:
These messages can be ignored. These relate to the Hibernate services used by JBoss ON and how it automatically cleans up after itself to prevent memory leaks.
13:43:10,781 WARN [LoadContexts] fail-safe cleanup (collections) : 
org.hibernate.engine.loading.CollectionLoadContext@103583b 
<rs=org.postgresql.jdbc3.Jdbc3ResultSet@d16f5b>

7. Server and Agent Plug-ins

Q:
How can I extend JBoss ON?
A:
New plug-ins can be written to support other external projects and custom applications.
Q:
How can I write a plug-in for JBoss ON?
A:
Writing both agent and server plug-ins is covered in the Plug-in Writing Guide in the JBoss ON documentation set.
Q:
What is the skeleton plug-in module?
A:
To start writing custom plug-ins, begin with the custom-plug-in Maven module skeleton, a template available with the JBoss ON source code. This includes a Maven POM with defined dependencies, a starter set of resource components, and a minimal rhq-plugin.xml plug-in descriptor.

8. General Resource Questions

Q:
I deleted a Platform from inventory. How can I rediscover it, so I can re-import it?
A:
You can force an agent discovery by issuing the following command at the agent command prompt:
> discovery -f
Q:
On a Red Hat Enterprise Linux platform, interface "sit0" is discovered, but it is always red. How can I remove this interface from inventory?
A:
Because network adapters are child services, there is currently no way to tell JBoss ON to ignore them or not inventory them.
However, there is an easy way to disable this interface on the machine itself. If you have root or sudo access to the box, disable all IPv6 support.
Change the NETWORKING_IPV6 value in /etc/sysconfig/network:
NETWORKING_IPV6=no

WARNING

Disabling this interface will disable IPv6 support.
Then, edit /etc/modprobe.conf to include the following lines:
alias net-pf-10 off
alias ipv6 off
Stop the ip6tables service:
service ip6tables stop
Disable the ip6tables service:
chkconfig ip6tables off
Q:
How can I collect syslog messages as JBoss ON events?
A:
The Linux platform plug-in can monitor syslog messages by emitting them as events. Syslog messages can be collected by the plug-in by either reading syslog message files or by receiving them over a socket listener.
The syslog must be configured to format the messages in a way that JBoss ON can parse. You can either tell JBoss ON (in the platform's plug-in configuration or connection properties) what regular expressions can parse syslog messages, or format the messages in the syslog config file (/etc/rsyslog.conf) so that JBoss ON understands. For example:
$template RHQfmt,"%timegenerated:::date-rfc3339%,%syslogpriority-text%,%syslogfacility-text%:%msg%\n"
If you then use RHQfmt in the syslog configuration so it writes messages out in that format, JBoss ON understands the log messages fully. For example:
$template RHQfmt,"%timegenerated:::date-rfc3339%,%syslogpriority-text%,%syslogfacility-text%:%msg%\n"
*.* /var/log/messages-for-rhq;RHQfmt
*.* @@127.0.0.1:5514;RHQfmt
This configuration both writes syslog messages to /var/log/messages-for-rhq and sends the messages over TCP to a listener on port 5514, as configured in the platform's connection properties.
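To see what the RHQfmt layout looks like to a parser, the following sketch splits one line into its four fields with a regular expression. It uses Python's regex syntax purely for illustration. The sample line, the pattern, and the function name parse_rhqfmt are assumptions, not the expressions JBoss ON itself uses.

```python
import re

# Lines written with the RHQfmt template have the shape:
#   timestamp,severity,facility:message
# The RFC 3339 timestamp contains colons but no commas, so the first two
# fields are split on commas and only the facility is split on a colon.
RHQFMT_LINE = re.compile(
    r"^(?P<timestamp>[^,]+),(?P<severity>[^,]+),(?P<facility>[^:]+):(?P<message>.*)$"
)


def parse_rhqfmt(line):
    """Return the fields of an RHQfmt line as a dict, or None if it does not match."""
    match = RHQFMT_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Hypothetical sample line, made up for this example.
sample = "2012-10-03T09:15:02.123456-04:00,info,daemon: service started"
parsed = parse_rhqfmt(sample)
print(parsed)
```

A line that was not written with the template, such as one with no comma separators, returns None here, which is the kind of mismatch that prevents messages from being parsed.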
Q:
Executing a script resource fails on Red Hat Enterprise Linux.
A:
If the Execute operation on a script resource fails immediately with an error message saying that the script cannot be executed, ensure that the script itself is executable. Set the execute permission on the script:
chmod a+x scriptname

9. JBoss Resources

Q:
Why does only one JBoss AS server show green availability and all the rest show red, even though I made sure all of my JNP credentials are configured properly in my resources' connection properties?
A:
There is a problem in the way the JBoss AS JNP client works. If you are managing multiple JBoss AS servers on a single box, all of your security credentials for those servers must be the same (i.e. the JNP username and password must be the same).
Q:
When I import a server like JBoss EAP 5 or Tomcat, I see its child JVM resource in inventory, but it is red (DOWN). Why?
A:
If a server is started with JMX remoting enabled and secured, the agent cannot connect to the JMX server because it cannot detect the proper credentials.
For example, if the JMX server has these system properties:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=5222
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=true
-Dcom.sun.management.jmxremote.password.file=/jmxremote.password
-Dcom.sun.management.jmxremote.access.file=/jmxremote.access
The agent's JMX plug-in examines the command line for the JMX server's process, sees that the JMX server is remoted and secured, and tries to set up its secure, remote JMX connector. Because the agent does not have the appropriate credentials, it cannot connect to the remote JMX MBean server and assumes it is in a DOWN state.
Edit the resource's Connection Settings, under the resource's Inventory tab, to enter the valid username and password set in the JMX remote access files. This enables JBoss ON to go through the remote JMX endpoint.
Alternatively, JBoss ON can connect to the parent resource, and then use that to connect to the JMX server. In that case, in the Connection Settings subtab, unset all of the connection properties except for the Type property, which should be set to Parent. The parent of the JVM, the JBoss EAP resource, can provide the information to connect to the JVM.
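The check that the JMX plug-in is described as making (is the JMX server remoted and secured?) can be pictured with a short sketch. This is not the plug-in's actual code. The function names are hypothetical, and the only inputs are the system properties listed in the example above.

```python
PREFIX = "-Dcom.sun.management.jmxremote"
BASE = "com.sun.management.jmxremote"


def jmx_remote_settings(command_line_args):
    """Collect the com.sun.management.jmxremote* system properties from a command line."""
    settings = {}
    for arg in command_line_args:
        if not arg.startswith(PREFIX):
            continue
        # Drop the leading "-D" and split "name=value"; bare flags get "".
        name, _, value = arg[2:].partition("=")
        settings[name] = value
    return settings


def needs_credentials(settings):
    """True if a remote JMX port is set and authentication is not switched off."""
    remoted = f"{BASE}.port" in settings
    # The JVM enables authentication unless the property is set to false.
    secured = settings.get(f"{BASE}.authenticate", "true").lower() != "false"
    return remoted and secured


args = [
    "-Dcom.sun.management.jmxremote",
    "-Dcom.sun.management.jmxremote.port=5222",
    "-Dcom.sun.management.jmxremote.ssl=false",
    "-Dcom.sun.management.jmxremote.authenticate=true",
    "-Dcom.sun.management.jmxremote.password.file=/jmxremote.password",
    "-Dcom.sun.management.jmxremote.access.file=/jmxremote.access",
]
print(needs_credentials(jmx_remote_settings(args)))
```

When this returns True for a server's command line, the JVM resource is likely to stay DOWN until credentials are entered or the connection type is set to Parent.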
Q:
When trying to monitor a JBoss EAP instance, I get the error "Connection failure Failed to authenticate principal=null, securityDomain=jmx-console."
A:
As explained in the JBoss EAP documentation and the JBoss EAP 4.3 documentation, the jmx-console is secured by default. Define the username and password as instructed in the EAP documentation. Then add this username and password under the Inventory > Configuration Properties page of the JBoss EAP instance.

NOTE

Starting a JBoss EAP instance without specifying a configuration parameter (-c) starts the instance in production configuration.
Q:
When monitoring a JBoss AS instance, I'm not seeing any JVM resources beneath it.
A:
For JBoss ON to discover JVM resources for a JBoss AS resource, the corresponding JBoss AS instance needs to be running Java 5 or later. It also needs to have been started with the jboss.platform.mbeanserver system property set. For example, in Red Hat Enterprise Linux, the ${JBOSS_HOME}/bin/run.conf file should have this setting:
JAVA_OPTS="$JAVA_OPTS -Djboss.platform.mbeanserver"
Q:
Can I monitor JBoss AS 5.1?
A:
No. There are problems in the JBoss AS 5.1 profile service which prevent the agent from discovering it.
You can monitor JBoss EAP 5.0 and later versions and JBoss AS 6.0.
Q:
My agent can detect my JBoss server and gets its connection properties, but the JNP connection fails. Why?
A:
This primarily happens on Windows if the agent is installed in a directory with spaces in the pathname, such as C:\Program Files.
If the agent can detect the JBoss server and can find all of its connection properties, then check the logs for failures to connect to the profile service:
2012-01-12 15:03:38,982 DEBUG [ResourceContainer.invoker.daemon-1] (org.rhq.plugins.jbossas5.ApplicationServerComponent)- Failed to connect to Profile Service.
java.lang.RuntimeException: Failed to lookup JNDI name 'ProfileService' from InitialContext.
	at org.rhq.plugins.jbossas5.connection.AbstractProfileServiceConnectionProvider.lookup(AbstractProfileServiceConnectionProvider.java:84)

[snip]
Move the agent to a directory without spaces in the pathname and then re-discover the JBoss resources.

10. Postgres Resources

Q:
Why is the agent showing an error in my PostgreSQL discovery about authentication failed for user "postgres"?
A:
The Postgres plug-in attempts to log into the database server using the username and password of postgres. In many installations, this is a default superuser and will work. However, this login can fail for several different reasons:
  • The postgres user has been deleted.
  • The password for the postgres user has been changed.
  • On Linux, the administrative login has been set to ident sameuser.
To resolve this:
  • Inventory the discovered Postgres resource. Its availability will show as down and it will not find any child resources.
  • Navigate to the INVENTORY tab for the Postgres resource.
  • Under Connection Properties, click the Edit button.
  • Change the role name and password fields to reflect a valid super user account on the Postgres instance.
Additionally, on Red Hat Enterprise Linux systems, Postgres may need to be reconfigured to allow password-based logins by changing settings in the pg_hba.conf file.
Q:
Why are most of the metrics for my Postgres resource showing up as NaN?
A:
In many installations, Postgres will not start its statistics collector by default. To enable statistics collection, add or change the following line in the postgresql.conf file:
stats_start_collector = on
Q:
How many database connections are necessary to monitor a Postgres database?
A:
Each Postgres database inventoried in JBoss ON requires one connection.
Q:
Why can't I drop my database that is inventoried in JBoss ON?
A:
Because availability and statistics are monitored so frequently, the Postgres plug-in keeps an open connection to the database. When attempting to drop a database currently inventoried in JBoss ON, an error is thrown about the database being in use. To drop the database, either shut down the JBoss ON agent monitoring the database or remove the database resource from JBoss ON. This closes the Postgres plug-in's connection to the Postgres server and allows you to drop the database.

11. Apache Resources

Q:
Where can I get the Apache connectors?
A:
The Apache plug-in monitors an Apache Web server through custom modules like the SNMP connector. The connectors can be downloaded from the JBoss ON server on the Downloads page in the GUI.
Q:
I have instrumented Apache with the Response Time module, but no RT metrics are being shown for my VirtualHosts.
A:
If there are no RT metrics being displayed for your VirtualHosts, there are two things you should check:
  1. Can the agent's system user read the _rt log files? For instance, under RHEL/Apache, the default permissions of /var/log/httpd are 700, root:
    ls -arltd /var/log/httpd/
    drwx------ 2 root root 4096 Jul 28 11:36 /var/log/httpd/
    A workaround is to specify an alternate log directory for the httpd logs or, alternatively, to change the permissions of /var/log/httpd. Both of these have specific security implications. You could also run the agent as root; while this is the least preferable option, there are cases where this is necessary in order to not compromise system security by modifying file permissions. For example, JBoss ON cannot monitor the postgres daemon without root permissions, due to 700, postgres permissions on its data directory. Since those permissions shouldn't be altered, the only remaining option is to run the agent as root.
  2. Have you enabled the HTTP Response Time metric in the Apache Virtual Host metric template? It is disabled by default.
To enable it:
  1. Administration > System Configuration > Templates | Apache HTTP Server > Apache Virtual Host, and click Edit Metric Template.
  2. Select the checkbox next to HTTP Response Time.
  3. At the bottom of the page, select Update schedules for existing resources of marked type.
  4. Set the collection interval.
  5. Click the Go button ([>]).

Note

The RT metrics will now work. It may take approximately 10 minutes for the metrics to appear.
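The first check above, whether the agent's system user can read the response time log files, can be scripted. The sketch below must be run as the same user that runs the agent. The function name unreadable_files and the log file path are hypothetical; substitute the actual _rt log files on your system.

```python
import os


def unreadable_files(paths):
    """Return the paths that the current user cannot read (or that do not exist)."""
    # os.access also fails when a parent directory, such as a 700
    # /var/log/httpd owned by root, cannot be traversed by this user.
    return [path for path in paths if not os.access(path, os.R_OK)]


# Hypothetical file name, used only for illustration.
candidates = ["/var/log/httpd/example_rt.log"]
print(unreadable_files(candidates))
```

Any path printed by the script is one the agent cannot read, whether because of the file's own permissions or those of its directory.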
Q:
Some of my Apache metrics show values of zero. Why?
A:
Three metrics show values of zero when you are monitoring with the SNMP module:
  • Bytes Received for GET Requests per Minute
  • Bytes Received for POST Requests per Minute
  • Total Number of Bytes Received per Minute
This is because of how SNMP interprets information from the request body. First, SNMP provides various length values for the request body, and a GET request does not have a body, so the bytes received for GET requests are not calculated and, therefore, have a value of zero. Second, Apache does not calculate a request body size if there is request chunking.
Q:
What is the Augeas plug-in?
A:
The Augeas plug-in is an abstract plug-in that exists solely as an extension point for other plug-ins. The Augeas plug-in provides the Java JNI classes necessary for other dependent plug-ins to use to access the Augeas native library. For example, the OpenSSHD plug-in depends on the Augeas plug-in because it uses the Augeas library to access the OpenSSH daemon configuration. Several other JBoss ON plug-ins use this Augeas plug-in:
  • hosts
  • grub
  • apt
Q:
Why does my agent log have the error "java.lang.UnsatisfiedLinkError: Unable to load library 'augeas': libaugeas.so: cannot open shared object file: No such file or directory"?
A:
This means that an Augeas-based plug-in has been deployed, but the Red Hat Enterprise Linux box does not have the Augeas native library installed. See http://augeas.net for more information on installing Augeas libraries.
Q:
Why does my Apache SNMP module fail to start with an error?
A:
There are a couple of common errors and causes.
Error: "Syntax error on line 1376 of /etc/httpd/conf/httpd.conf: Unable to write to SNMPvar directory" (on stderr)
Cause: Ensure the directory specified via the "SNMPVar" directive exists and is writable by the user that owns the Apache process.
Error: "init_master_agent: Invalid local port (Permission denied)" (in the error_log file)
Cause: See if your Apache error_log contains a log message similar to the following:
[notice] SELinux policy enabled; httpd running as context user_u:system_r:httpd_t:s0
This means the SELinux (Security-Enhanced Linux) policy is preventing the httpd process from binding to the SNMP agent port, 1610 by default. To resolve the problem, change SELinux to permissive mode by running the command /usr/bin/setenforce 0 and then restarting Apache. You should then see a message similar to the following in your error_log:
[notice] SELinux policy enabled; httpd running as context user_u:system_r:unconfined_t
This message has the term unconfined_t. This indicates SELinux is no longer restricting the process.

12. Tomcat Resources

Q:
I get the error "This resource's configuration has not yet been initialized" when I go to the Configuration tab for a Tomcat resource. Why?
A:
Specifically, the error occurs when going to the TomcatJVMLogging resource and attempting to open the Configuration tab.
This means that configuration management has not been enabled for the Tomcat resource. This can be done by going to the Tomcat server's Inventory tab, opening the Connections subtab, and enabling configuration management explicitly.

13. Provisioning and Content

Q:
When I try to create a bundle by uploading an Ant recipe XML file directly, the XML content seems to get corrupted and tags are placed out of order.
A:
If you upload an Ant script file as the recipe, you can't use self-closing XML tags like <property name="a" />. Each element must have an explicit closing tag, like <property name="a"></property>. If you don't want to revise your Ant script XML file, copy and paste the recipe directly into the text field instead of uploading the file.
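If a recipe has many self-closing tags, rewriting them by hand is tedious. The sketch below shows one way to expand them with Python's standard xml.etree.ElementTree module. The function name and the sample recipe are made up for illustration. Be aware that this parser drops XML comments and may rename namespace prefixes, so review the output before uploading it.

```python
import xml.etree.ElementTree as ET


def expand_empty_tags(xml_text):
    """Re-serialize XML so empty elements use explicit closing tags."""
    root = ET.fromstring(xml_text)
    # short_empty_elements=False writes <a></a> instead of <a />.
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


# Hypothetical, minimal recipe fragment.
recipe = '<project name="demo"><property name="a" /></project>'
expanded = expand_empty_tags(recipe)
print(expanded)
```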
Q:
What does the JBoss Patch Content feature of JBoss ON actually do? Is it completely automated?
A:
The patch process updates existing jar/class files with upgraded jar/class files that are contained in the patch zip. Some changes may need to be completed manually, such as any Not Performed steps. If your configuration does not include one of the jars to be patched, then that step is skipped.
The patching process does not explicitly care about what the server configuration profile is called or which base configuration it is derived from.
Patching JBoss servers and other methods of streaming content to JBoss ON resources are covered in the Basic Admin Guide.

14. Alerts

Q:
I just created an alert definition, and I know that my agent reported data that should have fired an alert immediately. But I don't see an alert. Why not?
A:
After an alert definition is created, it takes a few seconds to be inserted into the JBoss ON server alert caches and then propagated throughout the JBoss ON server cloud. An alert won't be fired until that alert definition is in the server alert cache.
When the alert definition is inserted into the cache, a message is recorded in the JBoss ON server logs:
INFO  [CacheConsistencyManagerBean] localhost took [51]ms to reload global cache
INFO  [CacheConsistencyManagerBean] localhost took [49]ms to reload cache for 1 agents
It generally takes around 30 seconds for an alert definition to be added to the cache. Wait at least a minute after creating a definition before checking if it fires an alert.
Q:
Why do I see alerts triggered on different metric values on different alert definition conditions when they are using the same metric?
A:
This can occur because of how alert conditions are processed when measurement data comes in from the agent. It happens if a single alert definition has multiple conditions that use the same metric and that alert definition uses the "ALL" conjunction. For example, an alert definition might have one condition for "alert if metric X is greater than 5" and a separate condition for "alert if metric X is less than 10."
The range alert condition works around this by doing the range check within a single condition. For more information, see Bugzilla 735262.
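One simplified way to picture the difference is sketched below. The model is an assumption made for illustration, not a description of JBoss ON internals: it treats each condition as satisfied when any incoming value matches it, and compares that with a single range check that must hold for one and the same value. All names and sample values are hypothetical.

```python
def all_conditions_any_value(values, conditions):
    """Assumed model: every condition is matched by at least one value, not necessarily the same one."""
    return all(any(condition(value) for value in values) for condition in conditions)


def range_condition(values, low, high):
    """A single range check: one value must lie strictly between low and high."""
    return any(low < value < high for value in values)


# Two reported values for metric X; neither lies between 5 and 10.
values = [12, 3]
conditions = [
    lambda x: x > 5,   # "alert if metric X is greater than 5"
    lambda x: x < 10,  # "alert if metric X is less than 10"
]

print(all_conditions_any_value(values, conditions))
print(range_condition(values, 5, 10))
```

In this model the two separate conditions are both satisfied (12 is greater than 5, and 3 is less than 10), even though no single value falls in the intended range, while the range check is not.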

15. Monitoring

Q:
Why does the Events tab not capture the events of one of the "Log Event Sources"?
A:
When you define an event source, you need to ensure three things:
  • It is enabled.
  • The path to the log is valid.
  • The selected date format matches what you have in the logs.
If you select the default date format when creating the event source, and that format does not match what you have in the logs, nothing is captured for the log events.
For example, if you select the default date format 2012-07-14 15:36:25,075 and the logs have 2012.08.27-09:46:34, then no log event is captured.
Specify a different date format according to java.text.SimpleDateFormat.
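The mismatch is easy to reproduce outside JBoss ON. The sketch below uses Python's datetime.strptime as a stand-in for java.text.SimpleDateFormat. The Python format strings, and the Java patterns shown in the comments, are my own translations of the two example timestamps above rather than values taken from JBoss ON.

```python
from datetime import datetime

# Format matching the default example, 2012-07-14 15:36:25,075
DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # SimpleDateFormat: yyyy-MM-dd HH:mm:ss,SSS
# Format matching the log example, 2012.08.27-09:46:34
CUSTOM_FORMAT = "%Y.%m.%d-%H:%M:%S"      # SimpleDateFormat: yyyy.MM.dd-HH:mm:ss


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the text does not match the format."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


log_timestamp = "2012.08.27-09:46:34"
print(try_parse(log_timestamp, DEFAULT_FORMAT))  # no match with the default format
print(try_parse(log_timestamp, CUSTOM_FORMAT))   # matches the custom format
```

The first call returns None because the separators and field layout differ, which mirrors why no events are captured until the event source's date format is changed.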
Q:
When do baselines auto-calculate?
A:
Go to the Administration page of the JBoss ON GUI and click the Server Configuration link. You will see settings for Automatic Baseline Configuration Properties. Baseline Frequency determines how often the baselines are calculated. The default is 3 days. This means that every 3 days a new set of baselines is calculated, except for those that were manually set by the user. These remain pinned to the baselines set by the user.
Baseline Dataset determines the minimum set of data that must have been collected for a measurement before a baseline for that measurement is calculated. The default is 7 days. For example, when it is determined that baselines should be calculated, every third day by default, only those measurements that have data that is 7 days old or older will have a baseline calculated. Any measurements that do not have data from 7 days ago are skipped. This ensures that when a measurement's baseline is calculated, you have a good representative set of data to include in the calculation (e.g. by default, you will have 7 days' worth of data included in the baseline calculation).
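The Baseline Dataset rule can be expressed as a small check. This is a sketch of the rule as described above, not the server's implementation. The function name, its parameters, and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def eligible_for_baseline(oldest_data, now, dataset_days=7):
    """True if the measurement has data at least dataset_days old."""
    return now - oldest_data >= timedelta(days=dataset_days)


# Hypothetical calculation time and oldest data points for two measurements.
now = datetime(2012, 10, 3)
print(eligible_for_baseline(datetime(2012, 9, 20), now))  # 13 days of data
print(eligible_for_baseline(datetime(2012, 10, 1), now))  # 2 days of data
```

With the default of 7 days, the first measurement gets a baseline at this calculation and the second is skipped until a later run.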

16. Operations

Q:
I clicked the Operations tab, but I don't see any available operations, and it says "No items to show." How do I schedule an operation?
A:
The Schedules tab shows a list of scheduled operations, meaning operations which are configured but have not yet been run. If there are no scheduled operations, then the tab has a description that reads No items to show.
That does not mean that there are no operations available for the resource; it only means that no operations have been scheduled.
To schedule an operation, click the New button in the lower left corner of the Operations tab, and fill in the operation information.