Red Hat Training

A Red Hat training course is available for Red Hat JBoss Enterprise Application Platform

13.8. Performance Tuning

13.8.1. Alternative Batch Loading Algorithms

Hibernate allows you to load data for associations using one of four fetching strategies: join, select, subselect and batch. Out of these four strategies, batch loading allows for the biggest performance gains as it is an optimization strategy for select fetching. In this strategy, Hibernate retrieves a batch of entity instances or collections in a single SELECT statement by specifying a list of primary or foreign keys. Batch fetching is an optimization of the lazy select fetching strategy.
There are two ways to configure batch fetching: per-class level or per-collection level.
  • Per-Class Level
    When Hibernate loads data on a per-class level, it requires the batch size of the association to pre-load when queried. For example, consider that at runtime you have 30 instances of a car object loaded in session. Each car object belongs to an owner object. If you were to iterate through all the car objects and request their owners, with lazy loading, Hibernate will issue 30 select statements - one for each owner. This is a performance bottleneck.
    You can instead, tell Hibernate to pre-load the data for the next batch of owners before they have been sought via a query. When an owner object has been queried, Hibernate will query many more of these objects in the same SELECT statement.
    The number of owner objects to query in advance depends upon the batch-size parameter specified at configuration time:
    <class name="owner" batch-size="10"></class>
    This tells Hibernate to query at least 10 more owner objects in expectation of them being needed in the near future. When a user queries the owner of car A, the owner of car B may already have been loaded as part of batch loading. When the user actually needs the owner of car B, instead of going to the database (and issuing a SELECT statement), the value can be retrieved from the current session.
    In addition to the batch-size parameter, Hibernate 4.2.0 has introduced a new configuration item to improve in batch loading performance. The configuration item is called Batch Fetch Style configuration and specified by the hibernate.batch_fetch_style parameter.
    Three different batch fetch styles are supported: LEGACY, PADDED and DYNAMIC. To specify which style to use, use org.hibernate.cfg.AvailableSettings#BATCH_FETCH_STYLE.
    • LEGACY: In the legacy style of loading, a set of pre-built batch sizes based on ArrayHelper.getBatchSizes(int) are utilized. Batches are loaded using the next-smaller pre-built batch size from the number of existing batchable identifiers.
      Continuing with the above example, with a batch-size setting of 30, the pre-built batch sizes would be [30, 15, 10, 9, 8, 7, .., 1]. An attempt to batch load 29 identifiers would result in batches of 15, 10, and 4. There will be 3 corresponding SQL queries, each loading 15, 10 and 4 owners from the database.
    • PADDED - Padded is similar to LEGACY style of batch loading. It still utilizes pre-built batch sizes, but uses the next-bigger batch size and pads the extra identifier placeholders.
      As with the example above, if 30 owner objects are to be initialized, there will only be one query executed against the database.
      However, if 29 owner objects are to be initialized, Hibernate will still execute only 1 SQL select statement of batch size 30, with the extra space padded with a repeated identifier.
    • Dynamic - While still conforming to batch-size restrictions, this style of batch loading dynamically builds its SQL SELECT statement using the actual number of objects to be loaded.
      For example, for 30 owner objects, and a maximum batch size of 30, a call to retrieve 30 owner objects will result in one SQL SELECT statement. A call to retrieve 35 will result in two SQL statements, of batch sizes 30 and 5 respectively. Hibernate will dynamically alter the second SQL statement to keep at 5, the required number, while still remaining under the restriction of 30 as the batch-size. This is different to the PADDED version, as the second SQL will not get PADDED, and unlike the LEGACY style, there is no fixed size for the second SQL statement - the second SQL is created dynamically.
      For a query of less than 30 identifiers, this style will dynamically only load the number of identifiers requested.
  • Per-Collection Level
    Hibernate can also batch load collections honoring the batch fetch size and styles as listed in the per-class section above.
    To reverse the example used in the previous section, consider that you need to load all the car objects owned by each owner object. If 10 owner objects are loaded in the current session iterating through all owners will generate 10 SELECT statements, one for every call to getCars() method. If you enable batch fetching for the cars collection in the mapping of Owner, Hibernate can pre-fetch these collections, as shown below.
    <class name="Owner"><set name="cars" batch-size="5"></set></class>
    Thus, with a batch-size of 5 and using legacy batch style to load 10 collections, Hibernate will execute two SELECT statements, each retrieving 5 collections.

13.8.2. Second Level Caching of Object References for Non-mutable Data

Hibernate automatically caches data within memory for improved performance. This is accomplished by an in-memory cache which reduces the number of times that database lookups are required, especially for data that rarely changes.
Hibernate maintains two types of caches. The primary cache (also called the first-level cache) is mandatory. This cache is associated with the current session and all requests must pass through it. The secondary cache (also called the second-level cache) is optional, and is only consulted after the primary cache has been consulted first.
Data is stored in the second-level cache by first disassembling it into a state array. This array is deep copied, and that deep copy is put into the cache. The reverse is done for reading from the cache. This works well for data that changes (mutable data), but is inefficient for immutable data.
Deep copying data is an expensive operation in terms of memory usage and processing speed. For large data sets, memory and processing speed become a performance-limiting factor. Hibernate allows you to specify that immutable data be referenced rather than copied. Instead of copying entire data sets, Hibernate can now store the reference to the data in the cache.
This can be done by changing the value of the configuration setting hibernate.cache.use_reference_entries to true. By default, hibernate.cache.use_reference_entries is set to false.
When hibernate.cache.use_reference_entries is set to true, an immutable data object that does not have any associations is not copied into the second-level cache, and only a reference to it is stored.

Warning

When hibernate.cache.use_reference_entries is set to true, immutable data objects with associations are still deep copied into the second-level cache.