Distribution mode cache for the web subsystem breaks session stickiness

Solution Unverified - Updated -

Environment

  • JBoss Enterprise Application Platform (EAP) 6.4.10 and earlier
  • JBoss Enterprise Application Platform (EAP) 7.x

Issue

  • Distribution mode cache for the web subsystem breaks session stickiness. When using dist mode, if you remove a worker from the balancer (for example, by disabling it through balancer-manager in mod_proxy or through jkstatus in mod_jk) while it is still part of the cluster, the other nodes keep sending any requests stuck to the original node back to that node without updating the jvmRoute. As a result, sessions 'spray' all over the cluster instead of sticking to a new node, which can cause session locking issues.

Resolution

EAP 6

This behavior was changed in 6.4.11. With -Djvmroute.dist.local.node=true set on 6.4.11 or later, dist mode sticks a request to the current local node instead of continually trying to stick it to a potentially unreachable dist session owner, which is what causes the session spraying.
Otherwise, an immediate workaround is to use the replicated cache mode for the web subsystem instead of dist mode.
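
As a minimal sketch of the first option, the property can also be declared in the server configuration instead of on the command line. This assumes a standalone server whose standalone-ha.xml already has (or can take) a <system-properties> element directly after <extensions>. Only the property name and value come from this article; the placement is an assumption to verify against your own profile.

```xml
<!-- Sketch only: equivalent to passing -Djvmroute.dist.local.node=true at startup. -->
<!-- Assumed location: standalone-ha.xml, immediately after the <extensions> element. -->
<system-properties>
    <!-- Takes effect on EAP 6.4.11 and later, per the resolution above -->
    <property name="jvmroute.dist.local.node" value="true"/>
</system-properties>
```

A server restart is typically required for the setting to take effect. In domain mode, the same <property> element would normally go under the relevant server group or host instead.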

EAP 7

In EAP 7.3.0 and later, with the feature implemented through WFLY-6944/JBEAP-6078/JBEAP-14797, you can configure <local-affinity/> instead of <primary-owner-affinity/> in the <infinispan-session-management> element of the distributable-web subsystem. This appends the current local node, rather than the primary cache owner of the session, as the jvmRoute suffix of the JSESSIONID cookie.

        <subsystem xmlns="urn:jboss:domain:distributable-web:2.0" default-session-management="default" default-single-sign-on-management="default">
            <infinispan-session-management name="default" cache-container="web" granularity="SESSION">
                <local-affinity/> <!-- replaces <primary-owner-affinity/> -->
            </infinispan-session-management>
            <infinispan-single-sign-on-management name="default" cache-container="web" cache="sso"/>
            <infinispan-routing cache-container="web" cache="routing"/>
        </subsystem>

Note that in EAP 7.x the same behavior occurs in both "dist" and "repl" modes, because "repl" is implemented as a special case of "dist" (that is, where owners == cache view size). So, unlike in EAP 6, changing from "distributed-cache" to "replicated-cache" cannot be used as a workaround.

Root Cause

  • This is a result of the design of distribution mode. When a node receives a request for a session it does not own, it does not make a network call to fetch the session information. Instead, it sends the request back to the originating node, thereby saving a network call.
  • For EAP 6: BZ-1233400

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
