Deployment of bundle sometimes fails with "Connection has been shutdown" when JON agent and server are using SSL
Environment
- Red Hat JBoss Operations Network (ON) 3.0, 3.1, 3.2, 3.3
- JBoss ON server and agent are using SSL for server to agent communication
- It has been more than a minute since the server executed an operation on the agent that is managing the destination for the provisioning bundle
Issue
- Deploy or redeploy of bundle frequently fails
- The graphical user interface (GUI) gives a message like this:

    Deployment - "Failed to schedule, agent on [Resource[id=10101, uuid=e416e385-01de-5019-2499-f39896d98bd6, type={JBossAS7}JBossAS7 Standalone Server, key=/opt/jboss/jboss-eap/standalone, name=EAP (192.168.1.1:9990), parent=eapserver.example.com, version=EAP 6.1.0.GA]] may be down: java.lang.reflect.UndeclaredThrowableException"

- Server log contains:

    ERROR [org.jboss.remoting.transport.socket.SocketClientInvoker] Got marshalling exception, exiting javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe

- server.log includes the following error:

    ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] (http-/0.0.0.0:7080-5) {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=EhtA72he/fcxoEpuDEQITmL04UgkR4+Jvmaqz9vcYSE3+8D9XOkC4HnvW/uKbSeMi8Y=, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: org.jboss.remoting.InvocationFailureException:Unable to perform invocation; nested exception is: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:java.net.SocketException: Broken pipe -> java.net.SocketException:Broken pipe. Cause: org.jboss.remoting.InvocationFailureException: Unable to perform invocation; nested exception is: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe

- Deploying a provisioning bundle to a resource fails
Resolution
- For JBoss ON 3.1.2, this issue has been resolved in the latest server hotfix patch.
- For JBoss ON 3.2 and 3.3, the agent configuration property rhq.communications.connector.transport-params should be updated to include the setting generalizeSocketException=true. For example, execute the agent prompt command:

    > setconfig rhq.communications.connector.transport-params=numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200&generalizeSocketException=true

  This can also be done by adding the necessary configuration to the agent's agent-configuration.xml and using the --cleanconfig command-line option when starting the agent.

  Please note that the agent must be restarted after making the configuration change.
If neither of the above solutions can be used, submit a command, such as the platform's View Processes operation, to the affected agent before attempting to deploy a bundle to one of its resources. After submitting the command, you have less than 1 minute to execute the bundle deploy operation.
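As a rough illustration of the transport-params change described above, the following shell sketch appends generalizeSocketException=true to an existing parameter string, or rewrites the value if the key is already present. The function name add_generalize_flag is hypothetical and not part of JBoss ON, and the starting value is simply the example string from this article with the new setting removed. It only builds the text to paste at the agent prompt and does not talk to the agent.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with JBoss ON): make sure a
# transport-params string contains generalizeSocketException=true.
add_generalize_flag() {
    params=$1
    key=generalizeSocketException
    case "&${params}" in
        *"&${key}="*)
            # Key already present: force its value to true.
            printf '%s\n' "$params" | sed -E "s/(^|&)${key}=[^&]*/\1${key}=true/"
            ;;
        *)
            # Key missing: append it, adding '&' only if params is non-empty.
            printf '%s\n' "${params:+${params}&}${key}=true"
            ;;
    esac
}

# Assumed starting value: the article's example without the new setting.
current='numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200'

# Print the line to enter at the agent prompt.
echo "setconfig rhq.communications.connector.transport-params=$(add_generalize_flag "$current")"
```

The value should be checked against the agent's actual current setting before use, since an agent may have been configured with different pool sizes or timeouts.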
Root Cause
This issue is a result of JBoss Remoting bug JBREM-1245. When SSL is enabled, a stale connection from the pool does not cause the pool to be repopulated as expected.
When a JBoss ON server submits a command to an agent, it creates a JBoss Remoting connection. If the agent does not receive further commands within 1 minute, it drops the connection. When the server later attempts to reuse that connection from the pool, it receives a socket I/O exception indicating that the connection is no longer valid. However, the remoting implementation does not handle this exception correctly, so it is passed up to the JBoss ON server and the bundle deployment fails. If a new deploy request is submitted within one minute, the bundle deploys successfully.
This issue was originally reported in JBoss ON 3.1.2 as Red Hat Bugzilla 1049009 and was addressed with a server hotfix. For JBoss ON 3.2 and later, the fix was included in the fix for JBoss Remoting bug JBREM-1245. However, that fix is active only when the JBoss Remoting transport configuration option generalizeSocketException is set to true, and JBoss ON 3.2 and 3.3 do not include this setting in the transport configuration used for communicating with the agent. Red Hat Bugzilla 1166383 has been opened to report this deficiency in 3.2 and later, and it will be addressed in a future release.
Diagnostic Steps
- Is SSL enabled for the agent that is managing the resource where the bundle is being deployed? This issue will only occur if SSL is enabled for server to agent communication.
- View the state of the JBoss ON server socket connection going to the affected agent. This can be done using netstat on the server and looking for the agent's IP address and port number in the output:

    netstat -nt | grep 16163

  This may reveal output similar to:

    tcp 112 0 192.168.1.50:42187 192.168.1.52:16163 CLOSE_WAIT java

  In this case the server's address is 192.168.1.50 and it has a socket opened to the agent on 192.168.1.52 port 16163. The state of this socket is CLOSE_WAIT. The CLOSE_WAIT state indicates this is a stale connection. The next attempt to deploy a bundle to this agent will fail and this issue applies.

  Note that if the socket state is ESTABLISHED or there is no connection being returned for this agent, the connection is not stale and this issue does not apply.

- Use JBoss Byteman to capture details surrounding the failure.
  - If necessary, download and install JBoss Byteman.

  - Create the necessary Byteman rule:

        cat >/tmp/BZ1166383.btm <<EOF
        ########################################################################
        # To the extent possible under law, Red Hat, Inc. has dedicated all
        # copyright to this software to the public domain worldwide, pursuant
        # to the CC0 Public Domain Dedication. This software is distributed
        # without any warranty.
        #
        # See <http://creativecommons.org/publicdomain/zero/1.0/>.
        #
        #
        # JBM-BZ1166383 Rule File
        #
        # This JBoss Byteman rule set may be helpful in diagnosing JBoss Remoting
        # errors.
        #
        # See Red Hat Bugzilla 1166383 <https://bugzilla.redhat.com/show_bug.cgi?id=1166383>.
        #
        RULE trace MicroSocketClientInvoker handleException
        CLASS ^org.jboss.remoting.transport.socket.MicroSocketClientInvoker
        METHOD handleException
        AT THROW ALL
        BIND socketInvoker:MicroSocketClientInvoker = \$0;
             generalizedExcept:boolean = socketInvoker.isGeneralizeSocketException();
             exception = \$^
        IF TRUE
        DO System.err.println("JBM-BZ1166383: generalizeSocketException = " + generalizedExcept + "; Exception: " + exception);
           exception.printStackTrace()
        ENDRULE
        EOF

  - Update the following command with the appropriate values and then execute it:

        BYTEMAN_HOME=/usr/share/byteman RHQ_SERVER_ADDITIONAL_JAVA_OPTS="-javaagent:${BYTEMAN_HOME}/lib/byteman.jar=script:/tmp/BZ1166383.btm" "${RHQ_SERVER_HOME}"/bin/rhqctl restart --server

    You will need to change the value of BYTEMAN_HOME from /usr/share/byteman to the full path of where you have JBoss Byteman installed. Also, if you used a different location or file name to store the Byteman rule file, you will need to update the script value from /tmp/BZ1166383.btm to the complete path and name of your rule file.

    NOTE: If you do not set the environment variables BYTEMAN_HOME and RHQ_SERVER_ADDITIONAL_JAVA_OPTS as part of the command used to start or restart the JBoss ON server, you must remember to export them so they are available to the child process where JBoss ON server is created. For example:

        BYTEMAN_HOME=/usr/share/byteman
        RHQ_SERVER_ADDITIONAL_JAVA_OPTS="-javaagent:${BYTEMAN_HOME}/lib/byteman.jar=script:/tmp/BZ1166383.btm"
        export BYTEMAN_HOME
        export RHQ_SERVER_ADDITIONAL_JAVA_OPTS
        "${RHQ_SERVER_HOME}"/bin/rhqctl restart --server
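The netstat check from the diagnostic steps above could be scripted along the following lines. This is only a sketch: the function name agent_socket_state and its output labels (stale, established, other, none) are invented for illustration, and it assumes the usual `netstat -nt` column order in which the foreign address is the fifth field and the state is the sixth.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -nt` output on stdin and classify the
# server's connection(s) to one agent, given as ip:port in $1.
agent_socket_state() {
    awk -v agent="$1" '
        # Field 5 is the foreign (agent) address, field 6 the socket state.
        $5 == agent {
            seen = 1
            if ($6 == "CLOSE_WAIT")       stale = 1
            else if ($6 == "ESTABLISHED") established = 1
        }
        END {
            if (stale)            print "stale"        # this issue applies
            else if (established) print "established"  # connection is not stale
            else if (seen)        print "other"        # some other socket state
            else                  print "none"         # no connection to the agent
        }'
}

# Demonstration using the sample line from this article; prints: stale
printf '%s\n' 'tcp 112 0 192.168.1.50:42187 192.168.1.52:16163 CLOSE_WAIT java' |
    agent_socket_state 192.168.1.52:16163
```

On a live server the same function would be fed real output, for example `netstat -nt | agent_socket_state 192.168.1.52:16163`, with the address replaced by that of the affected agent.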
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
