JBoss node can't rejoin with mod_cluster due to IllegalArgumentException


Environment

  • JBoss Enterprise Application Platform
  • JBoss Enterprise Web Server (EWS)
    • 1.0.1 and earlier
  • JBCS Apache HTTPD
    • 2.4.x

Issue

  • After a JBoss node goes down, it cannot rejoin mod_cluster when it comes back up. Errors like the following are logged continuously in server.log:
    
    ERROR [org.apache.catalina.core.ContainerBase] (ContainerBackgroundProcessor[StandardEngine[jboss.web]]) Exception invoking periodic operation:
    java.lang.IllegalArgumentException: Vhost: [1:1:1], Alias: localhost
    Vhost: [2:1:2], Alias: localhost
    Context: [1:1:1], Context: /app, Status: ENABLED
    Context: [1:1:2], Context: /app2, Status: ENABLED
    Context: [2:1:3], Context: /app, Status: ENABLED
    Context: [2:1:4], Context: /app2, Status: ENABLED
    
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.parseInfoResponse(DefaultMCMPHandler.java:549)
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:475)
            at org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler.status(DefaultMCMPHandler.java:446)
            at org.jboss.modcluster.CatalinaEventHandler.status(CatalinaEventHandler.java:331)
            at org.jboss.modcluster.CatalinaEventHandler.status(CatalinaEventHandler.java:56)
            at org.jboss.modcluster.CatalinaEventHandlerAdapter.lifecycleEvent(CatalinaEventHandlerAdapter.java:165)
            at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
            at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1348)
            at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1612)
            at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1601)
            at java.lang.Thread.run(Thread.java:662)
    
  • The Apache HTTPD logs report the error "All workers are in error state".

Resolution

  • Upgrade to EWS 1.0.2 and mod_cluster 1.0.10.GA_CP01+
  • If you see such an IllegalArgumentException on EAP 6, ensure that the mod_cluster 1.2.x httpd modules are used and not old 1.0 mod_cluster modules that are intended for EAP 5.
  • If you see the "Old node still exist" errors shown under Diagnostic Steps, see the solution mod_cluster repeats error "MEM: Old node still exist".
  • As a workaround, back up the mod_cluster slotmem (MemManagerFile) files to another location and restart httpd to clear any bad state that httpd has gotten into.

Root Cause

  • The relevant code:
  private Map<String, Set<VirtualHost>> parseInfoResponse(String response)
  {
...
    for (String line : response.split("\r\n|\r|\n"))
    {
      if (line.startsWith("Node:"))
      {
...
          String value = entry.substring(index + 1).trim();
...
          nodeMap.put(nodeId, value);
          virtualHostMap.put(value, new HashMap());
          break;
        }

      }
      else if (line.startsWith("Vhost:"))
      {
...
        String node = (String)nodeMap.get(ids[0]);

        if (node == null)
        {
          throw new IllegalArgumentException(response);  //line 549
        }
  • The response contains no "Node:" line, so the node lookup returns null when the "Vhost:" line is processed, and the IllegalArgumentException is thrown.
  • This malformed response is likely generated because a host was not properly removed:
    • JBoss nodes unexpectedly connect to an Apache server using the same JVM routes, which mixes up worker states.
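
The failure mode described above can be illustrated with a minimal sketch. The class below is a simplified, hypothetical stand-in for the lookup and is not the mod_cluster source: the InfoResponseSketch class, its method names, and the "Node: [1],Name: node1" line layout are assumptions made for illustration. It records node ids from "Node:" lines and throws as soon as a "Vhost:" line references an id that was never recorded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified illustration of the Node:/Vhost: lookup.
// This is NOT the mod_cluster implementation.
class InfoResponseSketch {

    // Extracts the leading node id from a bracketed id such as "[1]" or "[2:1:2]".
    static String nodeId(String line) {
        int open = line.indexOf('[');
        int close = line.indexOf(']');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException(line);
        }
        return line.substring(open + 1, close).split(":")[0];
    }

    // Counts the Vhost: lines seen for each node id declared by a Node: line.
    // Throws IllegalArgumentException when a Vhost: line names an undeclared node.
    static Map<String, Integer> countVhostsPerNode(String response) {
        Map<String, Integer> vhostsPerNode = new HashMap<>();
        for (String line : response.split("\r\n|\r|\n")) {
            if (line.startsWith("Node:")) {
                vhostsPerNode.put(nodeId(line), 0);
            } else if (line.startsWith("Vhost:")) {
                String id = nodeId(line);
                if (!vhostsPerNode.containsKey(id)) {
                    // Same outcome as in the article: the whole response becomes the message.
                    throw new IllegalArgumentException(response);
                }
                vhostsPerNode.merge(id, 1, Integer::sum);
            }
        }
        return vhostsPerNode;
    }

    public static void main(String[] args) {
        // Assumed layout of a well-formed response: the Node: line precedes its Vhost: line.
        String healthy = "Node: [1],Name: node1\nVhost: [1:1:1], Alias: localhost\n";
        System.out.println(countVhostsPerNode(healthy));

        // Response shaped like the one in the Issue section: Vhost: lines without any Node: line.
        String orphaned = "Vhost: [1:1:1], Alias: localhost\nVhost: [2:1:2], Alias: localhost\n";
        try {
            countVhostsPerNode(orphaned);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected response:\n" + e.getMessage());
        }
    }
}
```

Because the exception message is the full response text, the logged stack trace begins with the Vhost and Context lines, as seen in the Issue section.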

Diagnostic Steps

  • Using the mod_cluster_manager page, check the DUMP or INFO output to see if there is a host but no node corresponding to it.
  • Check the JBoss logs for MCMP failures like the following:
ERROR [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] Error [MEM: MEM: Old node still exist: {4}] sending command CONFIG to proxy 127.0.0.1:6666, configuration will be reset
[notice] Created: can't reuse worker for ajp://127.0.0.1:8009
  • The strings command can be used to get a rough idea of the contents of the mod_cluster slotmem files.
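
Where the strings command is not available, a similar rough inspection can be sketched in Java. This is only an approximation of what strings does, not a replacement for it: the SlotmemStrings class and its method are hypothetical names, the minimum run length of 4 follows the usual default of strings, and the file path is supplied by the caller because it depends on the MemManagerFile setting.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists runs of printable ASCII characters found in a binary file.
class SlotmemStrings {

    // Returns every run of at least minLength printable ASCII bytes (0x20 to 0x7e) in data.
    static List<String> printableRuns(byte[] data, int minLength) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (byte b : data) {
            if (b >= 0x20 && b <= 0x7e) {
                current.append((char) b);
            } else {
                if (current.length() >= minLength) {
                    runs.add(current.toString());
                }
                current.setLength(0);
            }
        }
        // Flush a run that extends to the end of the data.
        if (current.length() >= minLength) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SlotmemStrings.java <path-to-slotmem-file>");
            return;
        }
        // Reads the whole file into memory; intended for a backed-up copy of a slotmem file.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String run : printableRuns(data, 4)) {
            System.out.println(run);
        }
    }
}
```

Running it against a copy of a slotmem file prints any readable text stored in it, which can then be compared with the DUMP or INFO output.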
