JBoss node can't rejoin with mod_cluster due to IllegalArgumentException

Environment

  • JBoss Enterprise Application Platform
  • JBoss Enterprise Web Server (EWS)
    • 1.0.1 and earlier
  • JBCS Apache HTTPD
    • 2.4.x

Issue


  • After a JBoss node goes down and comes back up, it cannot rejoin mod_cluster. Errors like the following are logged continuously in server.log:
    ERROR [org.apache.catalina.core.ContainerBase] (ContainerBackgroundProcessor[StandardEngine[jboss.web]]) Exception invoking periodic operation:
    java.lang.IllegalArgumentException: Vhost: [1:1:1], Alias: localhost
    Vhost: [2:1:2], Alias: localhost
    Context: [1:1:1], Context: /app, Status: ENABLED
    Context: [1:1:2], Context: /app2, Status: ENABLED
    Context: [2:1:3], Context: /app, Status: ENABLED
    Context: [2:1:4], Context: /app2, Status: ENABLED
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.parseInfoResponse(DefaultMCMPHandler.java:549)
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:475)
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:446)
            at org.jboss.modcluster.CatalinaEventHandler.status(CatalinaEventHandler.java:331)
            at org.jboss.modcluster.CatalinaEventHandler.status(CatalinaEventHandler.java:56)
            at org.jboss.modcluster.CatalinaEventHandlerAdapter.lifecycleEvent(CatalinaEventHandlerAdapter.java:165)
            at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
            at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1348)
            at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1612)
            at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1601)
            at java.lang.Thread.run(Thread.java:662)
  • The Apache HTTPD logs report "All workers are in error state".

Resolution


  • Upgrade to EWS 1.0.2 and mod_cluster 1.0.10.GA_CP01+
  • If you see such an IllegalArgumentException on EAP 6, ensure that the mod_cluster 1.2.x httpd modules are used and not old 1.0 mod_cluster modules that are intended for EAP 5.
  • If you see the "Old node still exist" errors shown under Diagnostic Steps, see mod_cluster repeats error "MEM: Old node still exist".
  • As a workaround, back up the mod_cluster slotmem (MemManagerFile) files to another location and restart httpd to clear any bad state that httpd has entered.

Root Cause

  • The relevant code:
  private Map<String, Set<VirtualHost>> parseInfoResponse(String response)
    for (String line : response.split("\r\n|\r|\n"))
      if (line.startsWith("Node:"))
          String value = entry.substring(index + 1).trim();
          nodeMap.put(nodeId, value);
          virtualHostMap.put(value, new HashMap());

      else if (line.startsWith("Vhost:"))
        String node = (String)nodeMap.get(ids[0]);

        if (node == null)
          throw new IllegalArgumentException(response);  //line 549
  • The response contains no "Node:" line, so when a "Vhost:" line is processed the node lookup returns null and the IllegalArgumentException is thrown.
  • This malformed response is likely generated because a host was not properly removed.
  • JBoss nodes unexpectedly connecting to the same Apache server with identical JVM routes can mix up worker states.
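
The failure described above can be reproduced with a minimal sketch. This is a simplified, hypothetical stand-in for the check in parseInfoResponse, not the mod_cluster source: the class name InfoResponseSketch is invented for illustration, and the "Node: [1],Name: node1" line format is an assumption about what a healthy INFO response contains.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Simplified illustration only; names and the Node: line format are assumptions.
final class InfoResponseSketch {

    // Returns the virtual host aliases grouped by node name.
    static Map<String, Set<String>> parse(String response) {
        Map<String, String> nodeNames = new HashMap<>();          // node id -> node name
        Map<String, Set<String>> aliasesByNode = new HashMap<>(); // node name -> aliases

        for (String line : response.split("\r\n|\r|\n")) {
            if (line.startsWith("Node:")) {
                // Assumed format: "Node: [1],Name: node1,..."
                String id = line.substring(line.indexOf('[') + 1, line.indexOf(']'));
                int nameStart = line.indexOf("Name:") + "Name:".length();
                int nameEnd = line.indexOf(',', nameStart);
                String name = line.substring(nameStart, nameEnd < 0 ? line.length() : nameEnd).trim();
                nodeNames.put(id, name);
                aliasesByNode.put(name, new LinkedHashSet<>());
            } else if (line.startsWith("Vhost:")) {
                // "Vhost: [1:1:1], Alias: localhost": the first id is the node id
                String[] ids = line.substring(line.indexOf('[') + 1, line.indexOf(']')).split(":");
                String node = nodeNames.get(ids[0]);
                if (node == null) {
                    // No "Node:" line was seen for this id, as in the stack trace above
                    throw new IllegalArgumentException(response);
                }
                String alias = line.substring(line.indexOf("Alias:") + "Alias:".length()).trim();
                aliasesByNode.get(node).add(alias);
            }
        }
        return aliasesByNode;
    }

    public static void main(String[] args) {
        String healthy = "Node: [1],Name: node1\nVhost: [1:1:1], Alias: localhost\n";
        System.out.println(parse(healthy));

        // Same shape as the response in the Issue section: Vhost lines with no Node line
        String orphaned = "Vhost: [1:1:1], Alias: localhost\n"
                + "Context: [1:1:1], Context: /app, Status: ENABLED\n";
        try {
            parse(orphaned);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: Vhost entry without a matching Node entry");
        }
    }
}
```

Because the exception message is the whole response, the Vhost and Context lines appear in the log directly after "java.lang.IllegalArgumentException:", which matches the log excerpt at the top of this article.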

Diagnostic Steps

  • Using the mod_cluster_manager page, check the DUMP or INFO output to see if there is a host but no node corresponding to it.
  • Check the JBoss server.log and the httpd error log for MCMP failures like the following:
    ERROR [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] Error [MEM: MEM: Old node still exist: {4}] sending command CONFIG to proxy, configuration will be reset
    [notice] Created: can't reuse worker for ajp://
  • The strings command can be used to get a rough idea of the contents of the mod_cluster slotmem files.
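
The first check above (a host with no corresponding node) could be automated against saved INFO output along these lines. This is only a sketch: the class OrphanVhostCheck and its method are hypothetical, and it assumes that node entries begin with "Node: [id]" and that the first number in a "Vhost: [n:n:n]" entry is the node id, as in the log excerpt in this article. It does not handle the DUMP format.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting saved INFO output; not part of mod_cluster.
final class OrphanVhostCheck {
    // Assumed line shapes: "Node: [1],..." and "Vhost: [1:1:1], Alias: ..."
    private static final Pattern NODE = Pattern.compile("^Node: \\[(\\d+)\\]");
    private static final Pattern VHOST = Pattern.compile("^Vhost: \\[(\\d+):");

    // Returns the node ids referenced by Vhost lines that have no Node line.
    static Set<String> orphanNodeIds(String info) {
        Set<String> nodeIds = new LinkedHashSet<>();
        Set<String> vhostNodeIds = new LinkedHashSet<>();
        for (String raw : info.split("\r\n|\r|\n")) {
            String line = raw.trim();
            Matcher node = NODE.matcher(line);
            Matcher vhost = VHOST.matcher(line);
            if (node.find()) {
                nodeIds.add(node.group(1));
            } else if (vhost.find()) {
                vhostNodeIds.add(vhost.group(1));
            }
        }
        vhostNodeIds.removeAll(nodeIds);
        return vhostNodeIds;
    }

    public static void main(String[] args) {
        String info = "Node: [1],Name: node1\n"
                + "Vhost: [1:1:1], Alias: localhost\n"
                + "Vhost: [2:1:2], Alias: localhost\n";
        System.out.println("Node ids with hosts but no node: " + orphanNodeIds(info));
    }
}
```

A non-empty result suggests httpd is holding host entries for a node it no longer knows about, which is the state that triggers the IllegalArgumentException on the JBoss side.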

