Deployment of bundle sometimes fails with "Connection has been shutdown" when JON agent and server are using SSL

Environment

  • Red Hat JBoss Operations Network (ON) 3.0, 3.1, 3.2, 3.3
  • JBoss ON server and agent are using SSL for server to agent communication
  • It has been more than a minute since the server last executed an operation on the agent that manages the destination for the provisioning bundle

Issue

  • Deploy or redeploy of bundle frequently fails
  • The graphical user interface (GUI) gives a message like this:

    Deployment - "Failed to schedule, agent on [Resource[id=10101, uuid=e416e385-01de-5019-2499-f39896d98bd6, type={JBossAS7}JBossAS7 Standalone Server, key=/opt/jboss/jboss-eap/standalone, name=EAP (192.168.1.1:9990), parent=eapserver.example.com, version=EAP 6.1.0.GA]] may be down: java.lang.reflect.UndeclaredThrowableException"
    
  • Server log contains:

    ERROR [org.jboss.remoting.transport.socket.SocketClientInvoker] Got marshalling exception, exiting
    javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
    
  • server.log includes the following error:

    ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] (http-/0.0.0.0:7080-5) {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=EhtA72he/fcxoEpuDEQITmL04UgkR4+Jvmaqz9vcYSE3+8D9XOkC4HnvW/uKbSeMi8Y=, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: org.jboss.remoting.InvocationFailureException:Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:java.net.SocketException: Broken pipe -> java.net.SocketException:Broken pipe. Cause: org.jboss.remoting.InvocationFailureException: Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
    
  • Deploying a provisioning bundle to a resource fails

Resolution

  • For JBoss ON 3.1.2, this issue has been resolved in the latest server hotfix patch.
  • For JBoss ON 3.2 and 3.3, the agent configuration property rhq.communications.connector.transport-params should be updated to include the setting generalizeSocketException=true. For example, execute the agent prompt command:

    > setconfig rhq.communications.connector.transport-params=numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200&generalizeSocketException=true
    

    This can also be done by adding the necessary configuration to the agent's agent-configuration.xml and using the --cleanconfig command-line option when starting the agent.

    Note that the agent must be restarted after the configuration change is made.
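
As a rough illustration of the configuration change above, the following shell sketch builds a transport-params value that contains generalizeSocketException=true. The function name add_generalize_param is hypothetical and not part of JBoss ON; it only manipulates the option string and does not contact the agent. It assumes the value is a list of name=value pairs separated by "&", as in the setconfig example.

```shell
#!/bin/sh
# Hypothetical helper (not part of JBoss ON): print a transport-params
# string that is guaranteed to contain generalizeSocketException=true.
add_generalize_param() {
    params=$1
    case "&${params}&" in
        *"&generalizeSocketException=true&"*)
            # Already set as required; leave the value untouched.
            printf '%s\n' "$params"
            ;;
        *"&generalizeSocketException="*)
            # Present with another value; rewrite it to true.
            printf '%s\n' "$params" |
                sed 's/generalizeSocketException=[^&]*/generalizeSocketException=true/'
            ;;
        *)
            # Absent; append it, adding "&" only if other options exist.
            printf '%s\n' "${params:+${params}&}generalizeSocketException=true"
            ;;
    esac
}
```

The printed value could then be supplied to setconfig rhq.communications.connector.transport-params= at the agent prompt, or placed in agent-configuration.xml, followed by an agent restart.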

If neither of the above solutions can be used, submit a command, such as the platform's View Processes operation, to the affected agent before attempting to deploy a bundle to one of its resources. After submitting the command, you have less than 1 minute to execute the bundle deploy operation.

Root Cause

This issue is a result of JBoss Remoting bug JBREM-1245. When SSL is enabled, a stale connection from the pool does not cause the pool to be repopulated as expected.

When a JBoss ON server submits a command to an agent, it creates a JBoss Remoting connection. If the agent does not receive further commands within 1 minute, it drops the connection. When the server later attempts to reuse that connection from the pool, it receives a socket I/O exception indicating that the connection is no longer valid. The remoting implementation does not handle this exception correctly, so it is passed up to the JBoss ON server and the bundle deployment fails. If the deploy request is submitted within one minute of the previous command to the agent, the bundle deploys successfully.

This issue was originally reported in JBoss ON 3.1.2 as Red Hat Bugzilla 1049009 and was addressed with a server hotfix. The fix for JBoss ON 3.2 and later was included in the fix for JBoss Remoting bug JBREM-1245. However, that fix takes effect only when the JBoss Remoting transport configuration option generalizeSocketException is set to true, and JBoss ON 3.2 and 3.3 do not include this setting in the transport configuration used for communicating with the agent. Red Hat Bugzilla 1166383 was opened to report this deficiency in 3.2 and later, and it will be addressed in a future release.

Diagnostic Steps

  • Is SSL enabled for the agent that is managing the resource where the bundle is being deployed? This issue will only occur if SSL is enabled for server to agent communication.
  • View the state of the JBoss ON server socket connection going to the affected agent. This can be done using netstat on the server and looking for the agent's IP address and port number in the output:

    netstat -nt | grep 16163
    

    This may reveal output similar to:

    tcp      112      0 192.168.1.50:42187       192.168.1.52:16163        CLOSE_WAIT  java
    

    In this case, the server's address is 192.168.1.50 and it has a socket open to the agent at 192.168.1.52, port 16163. The socket is in the CLOSE_WAIT state, which indicates a stale connection. The next attempt to deploy a bundle to this agent will fail, and this issue applies.

    Note that if the socket state is ESTABLISHED, or if no connection is listed for this agent, the connection is not stale and this issue does not apply.

  • Use JBoss Byteman to capture details surrounding the failure.

    1. If necessary, download and install JBoss Byteman.
    2. Create the necessary Byteman rule:

      cat >/tmp/BZ1166383.btm <<EOF
      ########################################################################
      # To the extent possible under law, Red Hat, Inc. has dedicated all 
      # copyright to this software to the public domain worldwide, pursuant 
      # to the CC0 Public Domain Dedication. This software is distributed 
      # without any warranty.  
      # 
      # See <http://creativecommons.org/publicdomain/zero/1.0/>.
      #
      #
      # JBM-BZ1166383 Rule File
      # 
      # This JBoss Byteman rule set may be helpful in diagnosing JBoss Remoting 
      # errors.
      #
      # See Red Hat Bugzilla 1166383 <https://bugzilla.redhat.com/show_bug.cgi?id=1166383>.
      #
      RULE trace MicroSocketClientInvoker handleException
      CLASS ^org.jboss.remoting.transport.socket.MicroSocketClientInvoker
      METHOD handleException
      AT THROW ALL
      BIND socketInvoker:MicroSocketClientInvoker = \$0;
           generalizedExcept:boolean = socketInvoker.isGeneralizeSocketException();
           exception = \$^
      IF TRUE
      DO System.err.println("JBM-BZ1166383: generalizeSocketException = " + generalizedExcept + "; Exception: " + exception);
         exception.printStackTrace()
      ENDRULE
      EOF
      
    3. Update the following command with the appropriate values and then execute it:

      BYTEMAN_HOME=/usr/share/byteman RHQ_SERVER_ADDITIONAL_JAVA_OPTS="-javaagent:${BYTEMAN_HOME}/lib/byteman.jar=script:/tmp/BZ1166383.btm" "${RHQ_SERVER_HOME}"/bin/rhqctl restart --server
      

      You will need to change the value of BYTEMAN_HOME from /usr/share/byteman to the full path of where you have JBoss Byteman installed. Also, if you used a different location or file name to store the Byteman rule file, you will need to update the script value from /tmp/BZ1166383.btm to the complete path and name of your rule file.

      NOTE: If you do not set the environment variables BYTEMAN_HOME and RHQ_SERVER_ADDITIONAL_JAVA_OPTS as part of the command used to start or restart the JBoss ON server, you must remember to export them so they are available to the child process where JBoss ON server is created. For example:

      BYTEMAN_HOME=/usr/share/byteman 
      RHQ_SERVER_ADDITIONAL_JAVA_OPTS="-javaagent:${BYTEMAN_HOME}/lib/byteman.jar=script:/tmp/BZ1166383.btm" 
      export BYTEMAN_HOME
      export RHQ_SERVER_ADDITIONAL_JAVA_OPTS
      "${RHQ_SERVER_HOME}"/bin/rhqctl restart --server
      

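The netstat check in the diagnostic steps above can be sketched as a small filter. This is only an illustration: the function name find_stale_connections is hypothetical, and it assumes the column layout shown in the sample output (protocol, receive queue, send queue, local address, remote address, state). It reads netstat output on standard input and prints only connections to the given agent port that are in the CLOSE_WAIT state.

```shell
#!/bin/sh
# Hypothetical helper: filter "netstat -nt" output (read from stdin) for
# connections to the agent port that are stuck in CLOSE_WAIT.
# Assumes field 5 is the remote address and field 6 is the socket state.
find_stale_connections() {
    port=${1:-16163}   # 16163 is the agent port used in this article
    awk -v port="$port" '
        $1 ~ /^tcp/ && $5 ~ (":" port "$") && $6 == "CLOSE_WAIT"
    '
}

# Example (run on the JBoss ON server):
#   netstat -nt | find_stale_connections 16163
```

Any line printed corresponds to a stale connection as described above; no output suggests the connection is either ESTABLISHED or absent.
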