Chapter 15. HTTP Session State Replication

Software Load Balancer

A dedicated software-based service designed to distribute HTTP client session requests across multiple computer servers (cluster). The primary directive of a software load balancer is to maximize resource utilization, reduce request response times, and prevent server overload. The load balancer forwards client session requests to a server cluster, based on server load and availability.

Client Session

A semi-permanent connection between the client (an application) and the server. The load balancer determines whether the client session is created with persistence, or whether a client session is redistributed based on server load and availability.

Session Persistence

A client session that is exclusively allocated to a single server instance. The load balancer routes all HTTP requests associated with the client session to the allocated server instance only. Session persistence is commonly referred to as a sticky session.

Sticky Session

See Session Persistence.

Section 3.1, “Configure Worker Nodes in mod_jk” describes how to configure session state persistence in the load balancer to ensure a client in a session is always routed to the same server node.
Session persistence on its own is not a best-practice solution because if a server fails, all session state data is lost. For example, if a customer is about to make a purchase on a web site, and the server hosting the shopping cart instance fails, session state data associated with the cart is lost permanently.
One way of preventing client session data loss is to replicate session data across the servers in the cluster. If a server node fails or is shut down, the load balancer can fail over the next client request to any server node and obtain the same session state.
Using a load balancer that supports session persistence without configuring web applications for session replication lets you scale your implementation by avoiding the cost of session state replication: each request for a session is always handled by the same node, at the price of losing the session state if that node fails.
Session state replication is more expensive than basic session persistence, but the reliability it provides for session state data makes it important when creating a load balanced cluster.

15.1. Enabling session replication in your application

To enable replication of your web application you must tag the application as distributable in the web.xml descriptor. Here's an example:
<?xml version="1.0"?> 
<web-app  xmlns="http://java.sun.com/xml/ns/j2ee"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee 
                              http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" 
          version="2.4">
          
    <distributable/>
    
</web-app>
You can further configure session replication using the replication-config element in the jboss-web.xml file. However, the replication-config element only needs to be set if one or more of the default values described below is unacceptable. Here is an example:
<!DOCTYPE jboss-web PUBLIC
    "-//JBoss//DTD Web Application 5.0//EN"
    "http://www.jboss.org/j2ee/dtd/jboss-web_5_0.dtd">

<jboss-web>
   
   <replication-config>
      <cache-name>custom-session-cache</cache-name>
      <replication-trigger>SET</replication-trigger>
      <replication-granularity>ATTRIBUTE</replication-granularity>
      <replication-field-batch-mode>true</replication-field-batch-mode>
      <use-jk>false</use-jk>
      <max-unreplicated-interval>30</max-unreplicated-interval>
      <snapshot-mode>INSTANT</snapshot-mode>
      <snapshot-interval>1000</snapshot-interval>
      <session-notification-policy>com.example.CustomSessionNotificationPolicy</session-notification-policy>
   </replication-config>

</jboss-web>
All of the configuration elements are optional, and can be omitted if the default value is acceptable. A couple are commonly used; the rest are very infrequently changed from the defaults. We'll cover the commonly used ones first.
The <replication-trigger> element determines when the container should consider that session data must be replicated across the cluster. The rationale for this setting is that after a mutable object stored as a session attribute is accessed from the session, in the absence of a setAttribute call the container has no clear way to know if the object (and hence the session state) has been modified and needs to be replicated. This element has 3 valid values:
  • SET_AND_GET is conservative but not optimal performance-wise: it always replicates session data, even if the content was only accessed and not modified. This setting made some sense in JBoss Enterprise Application Platform 4, because it was a way to ensure that every request triggered replication of the session's timestamp. Since setting max-unreplicated-interval to 0 accomplishes the same thing at much lower cost, using SET_AND_GET makes no sense with Enterprise Application Platform 5.
  • SET_AND_NON_PRIMITIVE_GET is conservative but only replicates if an object of a non-primitive type has been accessed (i.e. the object is not of a well-known immutable JDK type such as Integer, Long, String, etc.). This is the default value.
  • SET assumes that the developer will explicitly call setAttribute on the session if the data needs to be replicated. This setting prevents unnecessary replication and can have a major beneficial impact on performance, but requires very good coding practices to ensure setAttribute is always called whenever a mutable object stored in the session is modified.
In all cases, calling setAttribute marks the session as needing replication.
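For an application whose code reliably calls setAttribute after every change to a mutable session attribute, the SET trigger could be selected with a descriptor along the following lines. This is a minimal sketch rather than a recommended configuration: the DOCTYPE declaration is omitted for brevity, and every other replication-config element is left at its default.

```xml
<jboss-web>
   <replication-config>
      <!-- Replicate only when the application calls setAttribute on the session. -->
      <replication-trigger>SET</replication-trigger>
   </replication-config>
</jboss-web>
```

With this trigger, a change made to an object obtained from the session is not replicated until the application passes that object to setAttribute again.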
The <replication-granularity> element determines the granularity of what gets replicated if the container determines session replication is needed. The supported values are:
SESSION
Specifies that the entire session attribute map should be replicated when any attribute is considered modified. Replication occurs at request end. This option replicates the most data and thus incurs the highest replication cost, but since all attribute values are always replicated together, it ensures that any references between attribute values will not be broken when the session is deserialized. For this reason it is the default setting.
ATTRIBUTE
Specifies only attributes that the session considers to be potentially modified are replicated. Replication occurs at request end. For sessions carrying large amounts of data, parts of which are infrequently updated, this option can significantly increase replication performance. However, it is not suitable for applications that store objects in different attributes that share references with each other (e.g. a Person object in the "husband" attribute sharing with another Person in the "wife" attribute a reference to an Address object). This is because if the attributes are separately replicated, when the session is deserialized on remote nodes the shared references will be broken.
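As an illustration, an application that keeps large, mutually independent objects in separate session attributes might opt into per-attribute replication as sketched here. The fragment assumes that no two attributes share object references; checking that is the application author's responsibility. The DOCTYPE declaration is omitted for brevity.

```xml
<jboss-web>
   <replication-config>
      <!-- Replicate only the attributes considered modified, not the whole attribute map.
           Assumes attributes do not share references with each other. -->
      <replication-granularity>ATTRIBUTE</replication-granularity>
   </replication-config>
</jboss-web>
```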
The other elements under the replication-config element are much less frequently used.
<cache-name>
Specifies the name of the JBoss Cache configuration that should be used for storing distributable sessions and replicating them around the cluster. This element lets web applications that require different caching characteristics specify the use of separate, differently configured JBoss Cache instances. In JBoss Enterprise Application Platform 4 the cache to use was a server-wide configuration that could not be changed per web application. The default value is standard-session-cache. See Section 15.3, “Configuring the JBoss Cache instance used for session state replication” for more details on JBoss Cache configuration for web tier clustering.
<replication-field-batch-mode>
Specifies whether all replication messages associated with a request will be batched into one message. This is applicable only if replication-granularity is FIELD. If replication-field-batch-mode is set to true, fine-grained changes made to objects stored in the session attribute map will replicate only when the HTTP request is finished; otherwise they replicate as they occur. Setting this to false is not advised. Default is true.
<use-jk>
Specifies whether the container should assume that a JK-based software load balancer (e.g. mod_jk, mod_proxy, mod_cluster) is being used for load balancing for this web application. If set to true, the container will examine the session ID associated with every request and replace the jvmRoute portion of the session ID if it detects a failover.
You need only set this to false for web applications whose URL cannot be handled by the JK load balancer.
The default value is null (i.e. unspecified). In this case the session manager uses the presence or absence of a jvmRoute configuration on its enclosing JBoss Web Engine (see Section 3.2, “Configuring JBoss to work with mod_jk”) to determine whether JK is used.
<max-unreplicated-interval>
Specifies the maximum interval between requests, in seconds, after which a request will trigger replication of the session's timestamp regardless of whether the request has otherwise made the session dirty. Such replication ensures that other nodes in the cluster are aware of the most recent value for the session's timestamp and won't incorrectly expire an unreplicated session upon failover. It also results in correct values for HttpSession.getLastAccessedTime() calls following failover.
A value of 0 means the timestamp will be replicated whenever the session is accessed. A value of -1 means the timestamp will be replicated only if some other activity during the request (e.g. modifying an attribute) has resulted in other replication work involving the session. A positive value greater than the HttpSession.getMaxInactiveInterval() value will be treated as probable misconfiguration and converted to 0; i.e. replicate the metadata on every request. Default value is 60.
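The special values of max-unreplicated-interval can be sketched as follows. The fragment uses 0, so the timestamp is replicated on every access; substituting -1 would restrict timestamp replication to requests that already cause other replication work. The DOCTYPE declaration is omitted for brevity, and the choice of 0 is purely illustrative.

```xml
<jboss-web>
   <replication-config>
      <!-- 0: replicate the session timestamp whenever the session is accessed.
           -1: replicate it only together with other replication work.
           Omit the element to keep the default of 60 seconds. -->
      <max-unreplicated-interval>0</max-unreplicated-interval>
   </replication-config>
</jboss-web>
```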
<snapshot-mode>
Specifies when sessions are replicated to the other nodes. Possible values are INSTANT (the default) and INTERVAL.
The typical value, INSTANT, replicates changes to the other nodes at the end of requests, using the request processing thread to perform the replication. In this case, the snapshot-interval property is ignored.
With INTERVAL mode, a background task is created that runs every snapshot-interval milliseconds, checking for modified sessions and replicating them.
Note that this property has no effect if replication-granularity is set to FIELD; in that case, INSTANT mode is always used.
<snapshot-interval>
Specifies how often (in milliseconds) the background task that replicates modified sessions should be started for this web application. Only meaningful if snapshot-mode is set to INTERVAL.
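A hedged sketch of how the two snapshot elements might be combined is shown here. The 2000 millisecond interval is an arbitrary illustrative value, not a recommendation, and the DOCTYPE declaration is omitted for brevity.

```xml
<jboss-web>
   <replication-config>
      <!-- Replicate modified sessions from a background task instead of the request thread. -->
      <snapshot-mode>INTERVAL</snapshot-mode>
      <!-- Run the background task every 2000 ms (illustrative value only). -->
      <snapshot-interval>2000</snapshot-interval>
   </replication-config>
</jboss-web>
```

One consequence to keep in mind, assuming the background task behaves as described: a change made shortly before a node fails may not have been replicated yet, because replication waits for the next run of the task.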
<session-notification-policy>
Specifies the fully qualified class name of the implementation of the ClusteredSessionNotificationPolicy interface that should be used to govern whether servlet specification notifications should be emitted to any registered HttpSessionListener, HttpSessionAttributeListener and/or HttpSessionBindingListener.

Important

Event notifications that may be appropriate in a non-clustered environment may not necessarily be appropriate in a clustered environment; see https://jira.jboss.org/jira/browse/JBAS-5778 for an example of why a notification may not be desired. Configuring an appropriate ClusteredSessionNotificationPolicy gives the application author fine-grained control over what notifications are issued.