JBoss ON server automatically switches to maintenance mode if it is unable to reach a storage node in its configured storage cluster

Environment

  • Red Hat JBoss Operations Network (ON)
    • 3.2
    • 3.3
  • JBoss ON server mode has been automatically set to MAINTENANCE
  • JBoss ON storage node is unreachable from the impacted JBoss ON server

Issue

  • Server switches to Maintenance Mode
  • Server had switched to Maintenance Mode
  • Server mode automatically switches to maintenance status
  • JBoss ON (JON) server mode automatically switches to maintenance status, and server.log shows the following WARN message:

    16:23:02,613 WARN  [org.rhq.enterprise.server.storage.StorageClientManager] (EJB default - 1) Storage client subsystem wasn't initialized because it wasn't possible to connect to the storage cluster. The RHQ server is set to MAINTENANCE mode. Please start the storage cluster as soon as possible.: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: jon.localhost.com/10.11.12.3([jon.localhost.com/10.11.12.3] Cannot connect))
        at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:185) [cassandra-driver-core-1.0.8-jboss-1.jar:]

Resolution

Ensure that the JBoss ON server is able to reach at least one storage node in the storage cluster. Once a storage node is reachable again, the affected JBoss ON server automatically switches from maintenance mode back to normal operation mode.
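
A quick way to check basic reachability from the affected server host is to attempt a TCP connection to the storage node. The following is a minimal sketch, not a supported tool. The function name can_reach is hypothetical, and port 9142 is assumed to be the storage node's CQL port; verify the actual value in your storage node configuration before relying on it. A successful connection only shows that the port accepts connections, not that the storage node is healthy.

```python
import socket
import sys


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (values are assumptions): python check_storage.py <host> <cql-port>
    host, port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_reach(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

Run it on the JBoss ON server host with the storage node's host name and CQL port as arguments, for example the host shown in the WARN message and the assumed port 9142.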

NOTE: If the server was put into maintenance mode by any other method, it does not return to normal automatically. You must manually switch it from maintenance to normal operation mode on the server topology administration page in the JBoss ON user interface (UI).

Root Cause

If the JBoss ON server is unable to communicate with at least one storage node from the storage cluster, it will go into maintenance mode. This prevents the loss of measurement and metric data.

A storage node is required for processing metric data. Because the server passes a large amount of measurement data directly from the agents to the storage node, it stops accepting new work from agents until its connection to the storage node has been re-established. Once the JBoss ON server is able to reach a member of its configured storage cluster, it automatically returns to normal operation mode, provided that it had put itself into maintenance mode because it lost connectivity to the storage cluster.
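
The behavior described above can be pictured as a small state model. This is an illustrative sketch only and is not taken from the JBoss ON source code. The class and method names are hypothetical, and the handling of a manual switch during a storage outage is an assumption. It shows the one rule this article states: the server returns to NORMAL on its own only when the storage outage was the reason it entered MAINTENANCE.

```python
from dataclasses import dataclass


@dataclass
class ServerModeModel:
    """Toy model of the operation mode transitions described in this article."""
    mode: str = "NORMAL"
    # True only when MAINTENANCE was entered because the storage cluster was lost.
    auto_maintenance: bool = False

    def storage_lost(self):
        # Losing every storage node moves a NORMAL server into MAINTENANCE.
        if self.mode == "NORMAL":
            self.mode = "MAINTENANCE"
            self.auto_maintenance = True

    def storage_restored(self):
        # Automatic recovery applies only to self-imposed maintenance.
        if self.mode == "MAINTENANCE" and self.auto_maintenance:
            self.mode = "NORMAL"
            self.auto_maintenance = False

    def admin_set_mode(self, mode):
        # A manual change (for example from the UI) clears the automatic flag.
        self.mode = mode
        self.auto_maintenance = False


server = ServerModeModel()
server.storage_lost()
server.storage_restored()
print(server.mode)  # NORMAL: the outage caused the maintenance, so it clears itself
```

In this model, a server that an administrator placed in MAINTENANCE stays there after storage_restored() and needs an explicit admin_set_mode("NORMAL").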

Diagnostic Steps

  • Has the JBoss ON server reported that a storage node is unreachable? This can be seen in the JBoss ON server log:

    ERROR [com.datastax.driver.core.ControlConnection] (Reconnection-1) [Control connection] Cannot connect to any host, scheduling retry in 16000 milliseconds
    
  • The following message will appear in the JBoss ON server log shortly after a connection to the storage cluster fails:

    INFO  [org.rhq.enterprise.server.cloud.instance.ServerManagerBean] (EJB default - 10) Notified communication layer of server operation mode MAINTENANCE
    
  • The following messages will appear in the JBoss ON server log shortly after the connection to the storage cluster has been restored:

    INFO  [org.rhq.enterprise.server.storage.StorageClusterMonitor] (Reconnection-0) Storage cluster is up
    INFO  [org.rhq.enterprise.server.cloud.instance.ServerManagerBean] (EJB default - 1) Notified communication layer of server operation mode NORMAL
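
The checks above can be sketched as a small log scan. This is a hedged example rather than a supported tool. It matches only the message fragments quoted in this article, and the function names and event labels are hypothetical. How you open the log is left to you, because the location of server.log depends on the installation.

```python
import re

# Message fragments taken from the log excerpts quoted in this article.
MODE_RE = re.compile(r"Notified communication layer of server operation mode (\w+)")
UNREACHABLE = "Cannot connect to any host"
CLUSTER_UP = "Storage cluster is up"


def scan_log(lines):
    """Return the storage and mode events found in lines, in log order."""
    events = []
    for line in lines:
        match = MODE_RE.search(line)
        if match:
            events.append(("mode", match.group(1)))
        elif UNREACHABLE in line:
            events.append(("storage_unreachable", None))
        elif CLUSTER_UP in line:
            events.append(("storage_up", None))
    return events


def last_mode(events):
    """Return the most recently reported operation mode, or None if none was logged."""
    for kind, value in reversed(events):
        if kind == "mode":
            return value
    return None
```

To use it, pass an open file object for server.log to scan_log() and then call last_mode() on the result. A last reported mode of MAINTENANCE that follows a storage_unreachable event matches the situation described in this article.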
    
