Chapter 9. Fault Tolerance and Reliability

9.1. System Reliability

There are many components and services within the JBoss Enterprise SOA Platform. The failure of some of them may go unnoticed by some or all of your applications, depending upon when the failure occurs. For example, if the registry service crashes after your consumer has successfully obtained all the EPR information for the services it needs to function, the crash will have no adverse effect on your application. However, if the registry fails before this point, your application will not be able to progress. Therefore, in any determination of reliability guarantees, it is necessary to consider both when failures occur and what types of failure they could be.
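The effect of failure timing can be illustrated with a minimal, self-contained sketch. It does not use the platform's registry API: the Registry interface, the FakeRegistry and Consumer classes, the service name and the EPR string are all hypothetical stand-ins chosen for illustration. The sketch assumes a consumer that resolves every EPR it needs up front and keeps the results locally.

```java
import java.util.HashMap;
import java.util.Map;

public class RegistryFailureSketch {

    /** Hypothetical stand-in for a registry lookup; not the platform's API. */
    interface Registry {
        String lookupEpr(String serviceName);
    }

    /** In-memory registry that can be "crashed" to simulate a failure. */
    static class FakeRegistry implements Registry {
        private final Map<String, String> entries = new HashMap<>();
        private boolean crashed = false;

        void register(String serviceName, String epr) {
            entries.put(serviceName, epr);
        }

        void crash() {
            crashed = true;
        }

        @Override
        public String lookupEpr(String serviceName) {
            if (crashed) {
                throw new IllegalStateException("registry unavailable");
            }
            String epr = entries.get(serviceName);
            if (epr == null) {
                throw new IllegalArgumentException("unknown service: " + serviceName);
            }
            return epr;
        }
    }

    /** Consumer that resolves all the EPRs it needs once and keeps them locally. */
    static class Consumer {
        private final Map<String, String> resolved = new HashMap<>();

        void resolveAll(Registry registry, String... serviceNames) {
            for (String name : serviceNames) {
                resolved.put(name, registry.lookupEpr(name));
            }
        }

        String eprFor(String serviceName) {
            String epr = resolved.get(serviceName);
            if (epr == null) {
                throw new IllegalStateException("EPR not resolved: " + serviceName);
            }
            return epr;
        }
    }

    /** Returns true if the consumer can still reach its service despite the crash. */
    static boolean canProgress(boolean crashBeforeLookup) {
        FakeRegistry registry = new FakeRegistry();
        registry.register("OrderService", "jms://example/orders"); // made-up EPR
        Consumer consumer = new Consumer();
        try {
            if (crashBeforeLookup) {
                registry.crash();          // failure before the EPRs are obtained
            }
            consumer.resolveAll(registry, "OrderService");
            registry.crash();              // failure after the EPRs are obtained
            consumer.eprFor("OrderService");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("crash after lookup, can progress: " + canProgress(false));
        System.out.println("crash before lookup, can progress: " + canProgress(true));
    }
}
```

In this sketch the same registry crash is harmless in the first case and blocks the consumer in the second, so the only variable is when the failure happens relative to the lookups.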
It is never possible to guarantee total reliability and fault tolerance. Hardware failures and human errors are inevitable. However, you can ensure that a system will generally tolerate failures, maintain data consistency and make forward progress. Fault-tolerance techniques, such as transactions or replication, always come at a cost to performance. This trade-off between performance and fault tolerance is best made with knowledge of the application. Attempting to impose one specific approach uniformly on all applications inevitably leads to poorer performance in situations where that approach was not necessary. For this reason, many of the fault-tolerance techniques supported by the JBoss Enterprise SOA Platform are disabled by default. You should enable them only when needed.