For the servlet container, when deployed using clustering and a cluster-aware application, HTTP session replication becomes a key aspect of performance and scalability across the cluster. There are two main methods for HTTP session replication: full replication, in which HTTP sessions are replicated to all nodes in the cluster, and buddy replication, in which each node has at least one buddy, with one being the default. Each replication method can perform either full object replication or fine-grained replication, where only the changes in the HTTP session are replicated.
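The difference between full object replication and fine-grained replication can be illustrated with a minimal sketch. The TrackedSession class below is a hypothetical model written only for this illustration. It is not the servlet container's API, and a real container tracks changes in its own way. The sketch only shows why sending changed attributes can put less data on the network than sending the whole session.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of a replicated HTTP session; not a container API. */
public class TrackedSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    public void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name); // remember what changed since the last replication
    }

    /** Full object replication: ship every attribute, changed or not. */
    public Map<String, Object> fullPayload() {
        dirty.clear();
        return new HashMap<>(attributes);
    }

    /** Fine-grained replication: ship only attributes changed since the last call. */
    public Map<String, Object> deltaPayload() {
        Map<String, Object> delta = new HashMap<>();
        for (String name : dirty) {
            delta.put(name, attributes.get(name));
        }
        dirty.clear();
        return delta;
    }

    public static void main(String[] args) {
        TrackedSession session = new TrackedSession();
        session.setAttribute("cart", "3 items");
        session.setAttribute("locale", "en");
        session.deltaPayload(); // first replication sends both attributes

        session.setAttribute("locale", "fr"); // only one attribute changes
        System.out.println("delta attributes: " + session.deltaPayload().size());
        System.out.println("full attributes:  " + session.fullPayload().size());
    }
}
```

In this model, a session whose data rarely changes produces small or empty delta payloads, while a session that replaces most of its attributes on every request gains little from the extra tracking.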
Full replication is the default configuration because most clusters have only a few nodes, so it provides maximum benefit with minimum configuration overhead. When there are more than two nodes, the method of replication needs to be considered, as each option offers different benefits. Although full replication requires the least configuration, the overhead of maintaining inter-node connections increases as the number of nodes increases. Factors which need to be considered include: how often new sessions are created, how often old sessions are removed, how often the data in the HTTP session changes, and the session's size. Making a decision requires answers to these questions and relies on knowledge of each application's usage pattern of HTTP sessions.
To illustrate this with a practical example, consider a clustered application which uses the HTTP session to store a reference to a single stateful session bean. The bean is reused by the client to scroll through a result set derived from a database query. Even when the result set is no longer needed and is replaced within the stateful session bean with another result set, the stateful session bean session is not canceled. Instead, a method is called that resets the state. So the HTTP session stays the same for the duration of the client's requests, no matter how many are made and how long they are active. This usage pattern is very static and involves a very small amount of data. It will scale very well with full replication, as the default network protocol is a reliable multicast over UDP. For each user, one packet (even assuming a 1,500 byte Ethernet frame size) will be sent over the network for all nodes to receive when a new HTTP session is created, and another when it is removed. In this case it is an efficient operation even across a fairly large cluster, because the memory overhead is very small when the HTTP session holds just a single reference. The size of the HTTP session is just one factor to consider. Even if the HTTP session is moderate in size, the amount of memory used is calculated as:
HTTP session size
x active sessions
x number of nodes
As the number of clients and/or cluster nodes increases, the amount of memory in use across all nodes rapidly increases. The amount of memory in use is the same across all cluster nodes because they each need to maintain the same information, despite the fact that each client is communicating with only one node at a time.
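The calculation above can be sketched in Java to compare cluster sizes. The session size and session count used here are hypothetical figures chosen for illustration, not measurements. The buddy replication estimate assumes that each session is held only on its owning node plus its buddies, which is a simplification of how a real cluster distributes sessions.

```java
public class SessionMemoryEstimate {

    /** Total bytes held across the cluster when every node stores every session. */
    public static long fullReplicationBytes(long sessionBytes, long activeSessions, int nodes) {
        return sessionBytes * activeSessions * nodes;
    }

    /**
     * Total bytes when each session lives on its owning node plus its buddies.
     * Assumption for this sketch: copies per session = 1 + number of buddies.
     */
    public static long buddyReplicationBytes(long sessionBytes, long activeSessions, int buddies) {
        return sessionBytes * activeSessions * (1 + buddies);
    }

    public static void main(String[] args) {
        long sessionBytes = 10 * 1024; // hypothetical 10 KiB HTTP session
        long activeSessions = 5_000;   // hypothetical active sessions across the cluster
        int buddies = 1;               // one buddy per node, the default

        for (int nodes : new int[] {2, 4, 8, 16}) {
            long full = fullReplicationBytes(sessionBytes, activeSessions, nodes);
            long buddy = buddyReplicationBytes(sessionBytes, activeSessions, buddies);
            System.out.printf("nodes=%d full=%d bytes buddy=%d bytes%n", nodes, full, buddy);
        }
    }
}
```

Under these assumptions the full replication total grows linearly with the number of nodes, while the buddy replication total depends only on the number of buddies. With two nodes and one buddy the two figures are equal, which fits the guidance that the choice matters once the cluster grows beyond two nodes.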
In summary, whether you should use the default configuration of full replication depends on your particular workload. If you have multiple applications deployed at the same time, the complexity increases, as you might have applications that behave quite differently from each other where the HTTP session is concerned. If you have a cluster larger than two nodes and you are doing HTTP session replication, then consider buddy replication as your starting point.