Chapter 2. Red Hat JBoss Enterprise Application Platform 7

2.1. Overview

Red Hat JBoss Enterprise Application Platform (EAP) 7 is a fast, secure and powerful middleware platform built upon open standards and compliant with the Java Enterprise Edition (Java EE) 7 specification. It provides high-availability clustering, powerful messaging, distributed caching and other technologies to create a stable and scalable platform.

The modular structure allows for services to be enabled only when required, significantly increasing start-up speed. The Management Console and Management CLI remove the need to edit XML configuration files by hand, adding the ability to script and automate tasks. In addition, it includes APIs and development frameworks that can be used to develop secure, powerful, and scalable Java EE applications quickly.

2.2. Clustering

Clustering refers to using multiple resources, such as servers, as though they were a single entity.

In its simplest form, horizontal scaling can be accomplished by using load balancing to distribute the load between two or more servers. This approach quickly becomes problematic when the server holds non-persistent, in-memory state, which is typically associated with a client session. Sticky load balancing attempts to address these concerns by ensuring that client requests tied to the same session are always sent to the same server instance. For web requests, sticky load balancing can be accomplished through either a hardware load balancer or a web server with a software load balancer component. This architecture covers the use of a web server with specific load balancing software, but the principles remain largely the same for other load balancing solutions, including those that are hardware based.

While this sticky behavior protects callers from loss of data due to load balancing, it does not address the potential failure of a node holding session data. Session replication provides a copy of the session data in one or several other nodes of the cluster, so that they can fully compensate for the failure of the original node.

Session replication, or any similar approach to providing redundancy, presents a tradeoff between performance and reliability. Replicating data through network communication or persisting it on the file system carries a performance cost, and keeping in-memory copies on other nodes carries a memory cost. The alternative, however, is a single point of failure where the session is concerned.

JBoss EAP 7 supports clustering at several different levels and provides both load balancing and failover benefits. Some of the subsystems that can be made highly available include:

  • Instances of the Application Server
  • The Web Subsystem / Servlet Container
  • EJB, including stateful, stateless, and entity beans
  • Java Naming and Directory Interface (JNDI) services
  • Single Sign On (SSO) Mechanisms
  • Distributed cache
  • HTTP sessions
  • Java Message Service (JMS) and message-driven beans (MDBs)

Ideally, a cluster of JBoss EAP 7 servers is viewed as a single EAP 7 instance, with the redundancy and replication transparent to the caller.

Figure 2.1. Red Hat JBoss EAP 7 cluster


2.3. HTTP Sessions

An HTTP session is implicitly or explicitly created for an HTTP client and is maintained until it is either explicitly invalidated or naturally times out. The existence of an HTTP session makes a web application stateful and creates the need for an intelligent clustering solution.

Once a JBoss EAP 7 cluster is configured and started, a web application simply needs to declare itself as distributable to take advantage of the EAP 7 session replication capabilities. JBoss EAP 7 uses Infinispan to provide session replication. One available mode for this purpose is replication mode. A replicated cache replicates all session data across all nodes of the cluster asynchronously. This cache can be created in the web cache container and configured in various ways. Replication mode is a simple and traditional clustering solution that avoids a single point of failure and allows requests to seamlessly transition to another node of the cluster. It works by creating and maintaining a copy of each session on all other nodes of the cluster.
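For illustration, the following is a minimal web.xml sketch marking an application as distributable; the descriptor version shown assumes Servlet 3.1, as included in Java EE 7:

<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                             http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
         version="3.1">
    <!-- Opt in to session replication; no application code changes are required -->
    <distributable/>
</web-app>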

Assuming a cluster of three nodes, with one session created on each node, sessions A, B, and C are each replicated to all three nodes; the following diagram depicts the effect of session replication in such a scenario:

Figure 2.2. HTTP Session Clustering, Replication


The replication approach works best in small clusters. As the number of users increases and the size of the HTTP session grows, keeping a copy of every active session on every single node means that horizontal scaling can no longer counter the increasing memory requirement.

For a larger cluster of four nodes with two sessions created on each node, there are eight sessions stored on every node:

Figure 2.3. HTTP Session Replication in Larger Clusters


In such scenarios, the preferred strategy is distribution mode. In distribution mode, Infinispan allows the cluster to scale linearly as more servers are added. Each piece of data is copied to a set number of other nodes, but unlike the buddy replication system used in older versions of JBoss EAP, those nodes are not statically designated, and proven grid algorithms are used to scale the cluster more effectively.

The web cache container is configured with a default distributed cache called dist, which is initially set up with 2 owners. The number of owners determines the total number of nodes that will contain each data item, so an owners count of 2 results in one backup copy of the HTTP session.
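As a sketch, the owners count can be adjusted through the Management CLI; the resource path below reflects the default web cache container and dist cache, and attribute details may vary between EAP 7 releases:

/subsystem=infinispan/cache-container=web/distributed-cache=dist:write-attribute(name=owners, value=3)
reload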

Sticky load balancing behavior results in all HTTP requests being directed to the original owner of the session. If that server fails, a new session owner is designated, and the load balancer is automatically associated with the new host, so that a remote call results in a transparent redirection to a node that now contains the session data, while new backup data is created on the remaining servers.

Reproducing the previous example of a larger cluster of four nodes, with two sessions created on each node, there would only be four sessions stored on every node with distribution:

Figure 2.4. HTTP Session Distribution in Larger Clusters

Note

Refer to Red Hat JBoss EAP 7 documentation for details on configuring the web clustering mode, including the number of copies in the distribution mode.

Administrators can also specify a limit on the number of currently active HTTP sessions, resulting in the passivation of some sessions, to make room for new ones, when this limit is reached. Passivation and subsequent activation follow the Least Recently Used (LRU) algorithm. The maximum number of active sessions, or the idle time before passivation occurs, can be configured through the CLI, as described in the JBoss EAP documentation.
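As an illustrative sketch, such a limit can also be declared per application in the jboss-web.xml deployment descriptor; the limit of 100 below is hypothetical:

<jboss-web>
    <!-- Passivate the least recently used sessions once 100 sessions are active -->
    <max-active-sessions>100</max-active-sessions>
</jboss-web>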

2.4. Stateless Session Beans

By definition, a stateless session bean avoids holding data on behalf of its client. This makes it easier to cluster stateless session beans, and removes any concern about data loss resulting from failure. Contrary to stateful HTTP conversations and stateful session beans, the lack of state in a stateless bean means that sequential calls made by the same client through the same stub can be load balanced across available cluster nodes. This is in fact the default behavior of stateless session beans in JBoss EAP 7 when the client stub is aware of the cluster topology. Such awareness can be achieved either by designating the bean as clustered or through EJB client configuration that lists all server nodes.

The JNDI lookup of a stateless session bean returns an intelligent stub, which has knowledge of the cluster topology and can successfully load balance or fail over a request across the available cluster nodes. This cluster information is updated with each subsequent call so the stub has the latest available information about the active nodes of the cluster.
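For example, a standalone client can be made aware of multiple nodes through a jboss-ejb-client.properties file; the host names and ports below are hypothetical:

# Hypothetical two-node setup; the stub load balances across these connections
endpoint.name=client-endpoint
remote.connections=node1,node2
remote.connection.node1.host=host1.example.com
remote.connection.node1.port=8080
remote.connection.node2.host=host2.example.com
remote.connection.node2.port=8080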

2.5. Stateful Session Beans

With a stateful session bean, the container dedicates an instance of the bean to each client stub. The sequence of calls made through the stub is treated as a conversation, and the state of the bean is maintained on the server side, so that each call potentially affects the result of a subsequent call. Once again, the stub of the stateful session bean, returned to a caller through a JNDI lookup, is an intelligent stub with knowledge of the cluster. This time, however, the stub treats the conversation as a sticky session and routes all requests to the same cluster node, unless and until that node fails. Much like HTTP session replication, JBoss EAP 7 uses an Infinispan cache to hold the state of the stateful session bean and enable failover in case the active node crashes. Once again, only a distributed cache is preconfigured under the ejb cache container and set up as the default option in JBoss EAP 7. The ejb cache container can be independently configured to use a new cache that is set up by an administrator.
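As a sketch, an administrator might add a new cache to the ejb cache container and make it the default; the cache name repl is hypothetical, and the exact operations may differ between releases:

/subsystem=infinispan/cache-container=ejb/replicated-cache=repl:add(mode=ASYNC)
/subsystem=infinispan/cache-container=ejb:write-attribute(name=default-cache, value=repl)
reload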

2.6. Transaction Subsystem

A transaction consists of two or more actions which must either all succeed or all fail. A successful outcome is a commit, and a failed outcome is a roll-back. In a roll-back, each member’s state is reverted to its state before the transaction attempted to commit. The typical standard for a well-designed transaction is that it is Atomic, Consistent, Isolated, and Durable (ACID). Red Hat JBoss EAP 7 defaults to using the Java Transaction API (JTA) to handle transactions. Java Transaction Service (JTS) is a mapping of the Object Transaction Service (OTS) to Java.

Note

When a node fails within a transaction, recovery of that transaction is attempted only when that node is restarted.

While common storage may be used among EAP cluster nodes, the transaction log store of each server cannot be distributed, and each node maintains its own object store. The node-identifier attribute of the transaction subsystem avoids conflict between nodes by clearly associating transactions with their owner nodes. This attribute is then used as the basis of the unique transaction identifier. Refer to the Red Hat documentation on configuring the transaction manager for further details.
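For example, the node identifier is typically set, through the Management CLI, to a value that is unique within the cluster:

/subsystem=transactions:write-attribute(name=node-identifier, value=node1)
reload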

2.7. Java Persistence API (JPA)

The Java Persistence API (JPA) is the standard for using persistence in Java projects. Java EE 7 applications use the Java Persistence 2.1 specification. Hibernate EntityManager implements the programming interfaces and life-cycle rules defined by the specification, providing JBoss EAP 7 with a complete Java persistence solution. JBoss EAP 7 is 100% compliant with the Java Persistence 2.1 specification, and Hibernate also provides features beyond the specification.

When using JPA in a cluster, the state is typically stored in a central database or a cluster of databases, separate and independent of the JBoss EAP cluster. As such, requests may go through JPA beans on any node of the cluster, and the failure of a JBoss EAP instance does not affect the data, where JPA is concerned.

First-level caching in JPA relates to the persistence context, which in JBoss EAP 7 is the Hibernate session. This cache is local, short-lived, and bound to the transaction. As long as the transaction is handled by the server, there is no concern in terms of horizontal scaling and clustering of JPA. However, in instances where the calling client controls the transaction and invokes EJBs resulting in multiple calls to the same JPA entities, there is a risk that different nodes of an EAP cluster would attempt to update and then read the same data, resulting in stale reads from the first-level cache. This can be avoided by ensuring that all such calls are directed to the same node. For further information on this issue, refer to the relevant Red Hat knowledgebase article.

JPA second-level caching refers to the more traditional database cache. It is a local data store of JPA entities that improves performance by reducing the number of required roundtrips to the database. With the use of second-level caching, it is understood that the data in the database may only be modified through JPA and only within the same cluster. Any change in data through other means may leave the cache stale, and the system subject to data inconsistency. However, even within the cluster, the second-level cache suddenly introduces a stateful aspect to JPA that must be addressed.

Red Hat JBoss EAP 7 uses Infinispan for second-level caching. This cache is set up by default and uses an invalidation-cache called entity within the hibernate cache container of the server profile.

To configure an application to take advantage of this cache, the value of hibernate.cache.use_second_level_cache needs to be set to true in the application's persistence.xml file. In JBoss EAP, the Infinispan cache manager is associated with JPA by default and does not need to be configured.

Once a persistence unit is configured to use second-level caching, it is the shared-cache-mode property in the application’s persistence configuration file that determines which entities get cached. Possible values for this property include:

  • ENABLE_SELECTIVE to only cache entities explicitly marked as cacheable
  • DISABLE_SELECTIVE to cache all entities, unless explicitly excluded from caching
  • ALL to cache all entities, regardless of their individual annotation
  • NONE to not cache any entity, regardless of its annotation

Individual entity classes may be annotated with @Cacheable(true) or @Cacheable(false) to explicitly request caching or exclusion from caching.
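Putting these pieces together, the following is a minimal persistence.xml sketch that enables second-level caching for explicitly marked entities; the persistence unit name and data source are hypothetical:

<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence" version="2.1">
    <persistence-unit name="examplePU">
        <jta-data-source>java:jboss/datasources/ExampleDS</jta-data-source>
        <!-- Cache only entities annotated with @Cacheable(true) -->
        <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
        <properties>
            <property name="hibernate.cache.use_second_level_cache" value="true"/>
        </properties>
    </persistence-unit>
</persistence>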

The entity cache is an invalidation cache, which means that entities are cached on each node only when they are loaded on that node. Once an entity is changed on one node, an invalidation message is broadcast to the cluster to invalidate and remove all cached instances of this entity on any node of the cluster. That results in the stale version of the entity being avoided on other nodes and an updated copy being loaded and cached on any node that requires it, at the time when it’s needed.

Figure 2.5. JPA Second-Level Cache with Invalidation


2.8. ActiveMQ Artemis

The messaging broker in JBoss EAP 6 was called HornetQ, a JBoss community project. The HornetQ codebase was donated to the Apache ActiveMQ project, and the HornetQ community joined that project to enhance the donated codebase and create a next-generation messaging broker. The result is Apache ActiveMQ Artemis, the messaging broker for JBoss EAP 7, providing messaging consolidation and backwards compatibility with JBoss EAP 6.

JBoss EAP 7 uses Apache ActiveMQ Artemis as its JMS broker, configured through the messaging-activemq subsystem. It fully replaces the HornetQ broker while retaining protocol compatibility with JBoss EAP 6.

Default configuration for the messaging-activemq subsystem is included when starting the JBoss EAP server with the full or full-ha configuration. The full-ha option includes advanced configuration for features like clustering and high availability.
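For example, a standalone server can be started with the full-ha configuration as follows:

$ EAP_HOME/bin/standalone.sh --server-config=standalone-full-ha.xml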

ActiveMQ’s HA feature allows JMS servers to be linked together as live-backup groups, where each live server has a backup. Live servers receive messages from clients, while a backup server is not operational until failover occurs. A backup server can be owned by only one live server, and in cases where there is more than one backup server in a backup group, only one announces itself as being ready to take over the live server. That server remains in passive mode, waiting to take over the live server’s work.

When a live server crashes or is brought down gracefully, the backup server currently in passive mode will become the new live server. If the new live server is configured to allow automatic failback, it will detect the old live server coming back up and automatically stop, allowing the old live server to start receiving messages again.

ActiveMQ Artemis supports two different HA strategies for backing up a server: data replication and shared store.

<server>
    ...
    <replication-master ... />
    OR
    <replication-slave ... />
    OR
    <shared-store-master ... />
    OR
    <shared-store-slave ... />
    ...
</server>

Data Replication

When using replication, the live and backup servers do not share the same data directories; all data synchronization is done over the network. Therefore, all (persistent) data received by the live server is duplicated to the backup.

If the live server is cleanly shut down, the backup server activates and clients fail over to the backup. This behavior is pre-determined and is therefore not configurable when using data replication.

Upon startup, the backup server first needs to synchronize all existing data from the live server before it can replace it. Unlike with shared storage, a replicating backup is therefore not fully operational immediately after startup. The time the synchronization takes depends on the amount of data to be synchronized and the network speed.

Shared Store

This style of high availability differs from data replication in that it requires a shared file system which is accessible by both the live and backup nodes. This means that the servers use the same location for their paging, message journal, bindings journal, and large messages in their configuration.
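As an illustrative sketch, these four locations can be pointed at a shared mount in the messaging-activemq subsystem; the /mnt/shared paths are hypothetical, and element details may vary between releases:

<server name="default">
    <shared-store-master failover-on-server-shutdown="true"/>
    <!-- All four directories must reside on storage visible to both live and backup -->
    <bindings-directory path="/mnt/shared/bindings"/>
    <journal-directory path="/mnt/shared/journal"/>
    <large-messages-directory path="/mnt/shared/largemessages"/>
    <paging-directory path="/mnt/shared/paging"/>
</server>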

Typically this shared file system will be some kind of high performance storage area network (SAN). Red Hat does not recommend using network-attached storage (NAS) for your storage solution.

The advantage of shared-store high availability is that no replication occurs between the live and backup nodes; this means it does not suffer any performance penalties due to the overhead of replication during normal operation.

The disadvantage of the shared store approach is that when the backup server activates, it needs to load the journal from the shared store, which can take some time depending on the amount of data in the store. It also requires a shared storage solution supported by JBoss EAP.

2.9. HTTP Connectors

JBoss EAP 7 can take advantage of the load-balancing and high-availability mechanisms built into external web servers, such as Apache HTTP Server, Microsoft IIS, and Oracle iPlanet, as well as through Undertow. JBoss EAP 7 communicates with an external web server using an HTTP connector. These HTTP connectors are configured within the undertow subsystem of JBoss EAP 7. Web servers include software modules which control the way HTTP requests are routed to JBoss EAP 7 worker nodes. Each of these modules works differently and has its own configuration method. Modules may be configured to balance work loads across multiple JBoss EAP 7 server nodes, move work loads to alternate servers in case of a failure event, or do both.

JBoss EAP 7 supports several different HTTP connectors. The one you choose depends on both the Web Server you connect to and the functionality you require. The table below lists the differences between various available HTTP connectors, all compatible with JBoss EAP 7. For the most up-to-date information about supported HTTP connectors, refer to the official Red Hat documentation.

Table 2.1. HTTP Connectors

Connector        | Web Server                                                                   | Supported Operating Systems                                          | Supported Protocols
-----------------|------------------------------------------------------------------------------|----------------------------------------------------------------------|--------------------
mod_cluster      | Apache HTTP Server in Red Hat JBoss Web Server, Red Hat JBoss Core Services   | Red Hat Enterprise Linux®, Microsoft Windows Server, Oracle Solaris   | HTTP, HTTPS, AJP
mod_jk           | Apache HTTP Server in JBoss Web Server, JBoss Core Services                   | Red Hat Enterprise Linux®, Microsoft Windows Server, Oracle Solaris   | AJP
mod_proxy        | Apache HTTP Server in JBoss Web Server, JBoss Core Services                   | Red Hat Enterprise Linux®, Microsoft Windows Server, Oracle Solaris   | HTTP, HTTPS, AJP
ISAPI connector  | Microsoft IIS                                                                  | Microsoft Windows Server                                               | AJP
NSAPI connector  | Oracle iPlanet Web Server                                                      | Oracle Solaris                                                         | AJP

2.9.1. mod_cluster

mod_cluster is an HTTP load balancer that provides a higher level of intelligence and control over web applications than other HTTP load balancers. Using a special communication layer between the JBoss application server and the web server, mod_cluster registers not only when a web context is enabled, but also when it is disabled and removed from load balancing. This allows mod_cluster to handle full web application life cycles.

With mod_cluster, the load of a given node is determined by the node itself. This allows load balancing using a wider range of metrics, including CPU load, heap usage, and other factors, and makes it possible to use a custom load metric to determine the desired load balancing effect.

mod_cluster has two modules: one for the web server, which handles routing and load balancing, and one for the JBoss application server, which manages the web application contexts. Both modules must be installed and configured for the cluster to function. The mod_cluster module on JBoss EAP is available by default but may be further configured to auto-discover the proxy through multicast (advertise) or to contact it directly through an IP address and port.
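As a sketch, the EAP-side configuration in the ha and full-ha profiles resembles the following; the advertise socket binding shown assumes the default multicast setup, and a static list of proxies can be configured instead when multicast is unavailable:

<subsystem xmlns="urn:jboss:domain:modcluster:2.0">
    <mod-cluster-config advertise-socket="modcluster" connector="ajp">
        <dynamic-load-provider>
            <!-- The node computes and reports its own load; custom metrics can be added -->
            <load-metric type="cpu"/>
        </dynamic-load-provider>
    </mod-cluster-config>
</subsystem>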