Configuring "jboss.as.management.blocking.timeout" to Resolve Timeout in JBoss EAP 6/7

Solution Verified - Updated -

Environment

  • Red Hat JBoss Enterprise Application Platform (EAP)
    • 6.3 or later
    • 7.x
  • Deploying a new WAR/EAR or starting up servers
  • Large deployment, underpowered VM/machine, or long startup time

Issue

  • The deployment produces a ".failed" marker file in the deployments folder, rather than a ".deployed" file.
  • Slave servers don't start, and the master has an error in its log:

    • Operation timeout awaiting service container stability
    • In JBoss EAP 7

      [2019-06-14 17:32:36,649+0000] ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
          ("core-service" => "management"),
          ("management-interface" => "http-interface")
      ]'
      
    • In JBoss EAP 6

      ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS013412: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[("interface" => "management")]'
      

      or

      ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS013412: Timeout after [600] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
          ("core-service" => "management"),
          ("management-interface" => "native-interface")
      ]'
      
  • Unable to deploy WAR/EAR file, master server returns deployment error

    "WFLYDC0074: Operation failed or was rolled back on all servers. Server failures:" => {"server-group" => {"main-server-group" => {"host" => {"slave1" => {"mainserver1" => "WFLYCTL0409: Execution of operation 'deploy' on remote process at address '[
        (\"host\" => \"slave1\"),
        (\"server\" => \"mainserver1\")
    ]' timed out after 305000 ms while awaiting initial response; remote process has been notified to terminate operation"}}}}}},
        "rolled-back" => true,
        "server-groups" => {"cbx-server-group" => {"host" => {"slave1" => {"mainserver1" => {"response" => {
            "outcome" => "failed",
            "result" => undefined,
            "failure-description" => "WFLYCTL0409: Execution of operation 'deploy' on remote process at address '[
        (\"host\" => \"slave1\"),
        (\"server\" => \"mainserver1\")
    ]' timed out after 305000 ms while awaiting initial response; remote process has been notified to terminate operation",
            "rolled-back" => true
        }}}}}}
    
  • Has bugzilla issue 1117945 been fixed?

Resolution

Configure the jboss.as.management.blocking.timeout system property to tune the timeout (in seconds) for waiting on service container stability. The default is 300 seconds.

See the solution Add/remove/update system properties in JBoss EAP 6/7 for how to set system properties in JBoss EAP in various modes of operation.
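
For standalone mode, one way to pass the property is through JAVA_OPTS in bin/standalone.conf. The fragment below is a minimal sketch, assuming it is appended at the end of that file; the value 600 is only an example, and the variable name BLOCKING_TIMEOUT_OPT and the guard against adding the option twice are illustrative additions, not something the product requires.

```shell
# Sketch for the end of bin/standalone.conf (assumed location; 600 is an example value).
BLOCKING_TIMEOUT_OPT="-Djboss.as.management.blocking.timeout=600"

# Append the option only if no blocking-timeout option is present yet,
# so re-sourcing the file would not duplicate it.
case " ${JAVA_OPTS:-} " in
    *" -Djboss.as.management.blocking.timeout="*) ;;
    *) JAVA_OPTS="${JAVA_OPTS:-} $BLOCKING_TIMEOUT_OPT" ;;
esac
```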

In domain mode, users must set it per server:

/host=master/server-config=server-one/system-property=jboss.as.management.blocking.timeout:add(boot-time=true,value=600)  
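
When a host runs several servers, the same CLI operation has to be repeated for each server-config. The sketch below only generates the command text; the function name is hypothetical, and the host and server names passed to it are examples that must match the actual domain configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one CLI 'add' operation per server-config.
# Usage: blocking_timeout_cli_commands HOST TIMEOUT_SECONDS SERVER...
blocking_timeout_cli_commands() {
    local host="$1" timeout="$2"
    shift 2
    local server
    for server in "$@"; do
        # Same operation as shown above, with host, server and value substituted.
        printf '/host=%s/server-config=%s/system-property=jboss.as.management.blocking.timeout:add(boot-time=true,value=%s)\n' \
            "$host" "$server" "$timeout"
    done
}
```

The output can be redirected to a file, for example `blocking_timeout_cli_commands master 600 server-one server-two > set-timeout.cli`, and then run with `bin/jboss-cli.sh --connect --file=set-timeout.cli`.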

And set it for the domain controller:

  • Command line

    • Red Hat Enterprise Linux: bin/domain.sh -Djboss.as.management.blocking.timeout=600
    • Microsoft Windows: bin\domain.bat -Djboss.as.management.blocking.timeout=600
  • Configuration file for service startup

    • domain.conf

      PROCESS_CONTROLLER_JAVA_OPTS="$PROCESS_CONTROLLER_JAVA_OPTS -Djboss.as.management.blocking.timeout=600"
      
    • Edit domain.bat

      "%JAVA%" %PROCESS_CONTROLLER_JAVA_OPTS% ^
          "-Dorg.jboss.boot.log.file=%JBOSS_LOG_DIR%\process-controller.log" ^
          "-Dlogging.configuration=file:%JBOSS_CONFIG_DIR%/logging.properties" ^
          -jar "%JBOSS_HOME%\jboss-modules.jar" ^
          %MODULE_OPTS% ^
          -mp "%JBOSS_MODULEPATH%" ^
          org.jboss.as.process-controller ^
          -jboss-home "%JBOSS_HOME%" ^
          -jvm "%JAVA%" ^
          %MODULE_OPTS% ^
          -mp "%JBOSS_MODULEPATH%" ^
          -- ^
          "-Dorg.jboss.boot.log.file=%JBOSS_LOG_DIR%\host-controller.log" ^
          "-Dlogging.configuration=file:%JBOSS_CONFIG_DIR%/logging.properties" ^
          %HOST_CONTROLLER_JAVA_OPTS% ^
          -- ^
          -default-jvm "%JAVA%" ^
          -Djboss.as.management.blocking.timeout=600 ^
          %*
      

Notes

  • The only use case for setting the property at the server-group or individual domain server level is to apply a lower timeout than the one configured on the host controllers.
  • The range of values is 1 to 2147483 seconds. Setting the value to 0 will log a message and set it to 300 seconds.
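
A value can be checked against that range before it is written into a configuration file. This is a minimal sketch of such a pre-flight check; the function and variable names are hypothetical and it is not part of JBoss EAP. It only tests the documented range and makes no claim about how the server treats out-of-range input. The upper bound is consistent with a seconds-to-milliseconds conversion in a signed 32-bit integer (2147483 × 1000 = 2147483000 fits below 2147483647, while 2147484 × 1000 does not), although the article does not confirm that this is the reason.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of JBoss EAP.
# Range taken from the note above: 1 to 2147483 seconds.
MIN_BLOCKING_TIMEOUT=1
MAX_BLOCKING_TIMEOUT=2147483

is_valid_blocking_timeout() {
    local value="$1"
    # Digits only, at most 7 of them, so the comparison below cannot overflow.
    [[ "$value" =~ ^[0-9]{1,7}$ ]] || return 1
    # 10# forces base 10 so a leading zero is not read as octal.
    (( 10#$value >= MIN_BLOCKING_TIMEOUT && 10#$value <= MAX_BLOCKING_TIMEOUT ))
}
```

For example, `is_valid_blocking_timeout 600` succeeds, while `is_valid_blocking_timeout 0` returns a non-zero status.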

Root Cause

The class org.jboss.as.controller.BlockingTimeout loads the value of the system property jboss.as.management.blocking.timeout, or defaults to 300 (seconds). This property is not a timeout per deployment but a timeout on container stability: if jboss.as.management.blocking.timeout is reached during startup, all applications are undeployed and the container shuts down. The reasoning behind this is that a half-working server is potentially dangerous, as users may not notice major failures.
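
The error messages quoted in the Issue section print the timeout that was applied in square brackets (300 or 600 in the examples), so the log is one place to check whether a changed value took effect. The following is a rough sketch that extracts those numbers from log text; the function name is hypothetical and the pattern is based only on the message wording shown in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print the number of seconds
# from each "Timeout after [N] seconds waiting for service container stability" line.
reported_blocking_timeouts() {
    sed -n 's/.*Timeout after \[\([0-9][0-9]*\)\] seconds waiting for service container stability.*/\1/p'
}
```

A typical call would be `reported_blocking_timeouts < server.log | sort -u`, where server.log stands for whichever server or host controller log contains the error.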

Diagnostic Steps

  • Confirm whether a virus scanner is installed on the server.
  • If the issue appeared after changes to the application, check exactly what has changed since it last worked. Was something upgraded or added? Is it pointing to a new database or other external system?
  • Collect a series of thread dumps during the startup period to see where startup might be getting stuck.
  • Make sure to add the setting to domain.conf on both the domain controller and the slave host controllers.

Once users have collected the thread dumps, Red Hat can analyse them to see whether there is a deadlock, or a resource that the threads are waiting on, that is preventing the deployment or startup from completing.
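
Collecting the series of thread dumps can be scripted. This is a minimal sketch, assuming a JDK whose jstack tool is on the PATH and a server process ID that has been looked up separately; the function name, the default count and interval, and the file naming are illustrative choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take COUNT thread dumps of process PID, INTERVAL seconds apart.
# Usage: collect_thread_dumps PID [COUNT] [INTERVAL_SECONDS] [OUT_DIR]
collect_thread_dumps() {
    local pid="$1" count="${2:-6}" interval="${3:-10}" out_dir="${4:-.}"
    local i
    mkdir -p "$out_dir" || return 1
    for (( i = 1; i <= count; i++ )); do
        # jstack -l also prints lock information, which helps when looking for deadlocks.
        jstack -l "$pid" > "$out_dir/threaddump-$pid-$i.txt" || return 1
        # Pause between dumps, but not after the last one.
        if (( i < count )); then
            sleep "$interval"
        fi
    done
}
```

For example, `collect_thread_dumps 12345 6 10 /tmp/eap-dumps` would write six dumps, ten seconds apart, for the placeholder PID 12345. Start it while the server is still booting or deploying so that the dumps cover the period before the timeout.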

25 Comments

I have tried every option offered to increase my standalone bootup timeout, the CLI option above, adding the JAVA_OPTS="$JAVA_OPTS -Djboss.as.management.blocking.timeout=600" to standalone.conf - it remains firmly stuck on 30 seconds. Any assistance will be appreciated. Using jboss-eap-6.3

Try adding it to the standalone.xml file between extensions and management as:

...

</extensions>
   <system-properties>
         <property name="jboss.as.management.blocking.timeout" value="600"/>
   </system-properties>

<management>

...

I will update this solution as it is not clear. Thanks for highlighting this.

Tom -- experienced similar when trying to add to standalone configuration with CLI. I'm not able to add the property as I am given the response:
'boot-time' is not found among the supported properties: [value]

Try adding it without the boot-time part

ie.

[standalone@localhost:9999 /] /system-property=jboss.as.management.blocking.timeout:add(value=600)

I will update this solution as it is not clear. Thanks for highlighting this.

Thanks. My understanding is that there is not actually an overall bootup time limit, see root cause above. The fact it seemed to be 30 seconds was pretty much coincidental. The error message could probably be worded differently :-)

This worked for us (increasing timeout to value 600. ) But I am curious .... will increasing the memory (Xms and Xmx) JAVA_OPTS improve the start up times and hence require lesser blocking timeout value ?

Red Hat team, wouldn't just increasing the blocking timeout beyond 300 seconds, without looking into the root cause of why container stability is taking longer, actually hide the problem? Are there any suggestions you could provide to troubleshoot why the timeout is happening? Thanks, Pavan Tatikonda

https://access.redhat.com/support/cases/#/case/01592258 filed to further triage this timeout issue.

The mentioned URL is not accessible to other customers, so it makes no sense to post your own support tickets. Let us know what you get as further steps, as I tried it too and only got the answer to increase the timeout. Thanks, Michael

I have increased the timeout to 600 but am still facing the same issue. The same code works fine in production without any timeout. We have a McAfee antivirus scanner running on that machine, and I am not sure what its impact will be.

If the antivirus is slowing down the reading of the deployment (for example, a compressed EAR/WAR is being unzipped and the antivirus intercepts each read), the deployment could take much longer and hit the timeout.

where's the official documentation on this setting? (and other jboss settings). ie, what is its default, max/min acceptable values, can we set it to "infinite"/ remove it ? (ie, what if I don't want a startup timeout at all ?)

The default is 300 sec as mentioned in the article above. You cannot set it to infinity; if it is set to less than 0 it will default to 300 sec. You could set it to a large number, but that is not recommended. For example, if you have an application issue, or the application is trying to access data from another server that is down, the application will effectively hang startup since there is no timeout, and you will not know there is an issue.

What does "a timeout on container stability" exactly mean? Does container stability mean that app.war has deployed successfully?

When JBoss starts, all applications start concurrently once their dependencies/services are resolved and started. If any of the applications does not finish starting within the timeout, JBoss is unable to reach a stable/started state and will hit this timeout.

For what it's worth I still run into occasional timeout messages and have been able to ignore them. I'll receive a message that deployment failed due to timeout but wait a while longer then refresh the console and the app's running just fine. Observed in EAP 7.0 and 7.2.