Chapter 6. Hibernate Envers

6.1. About Hibernate Envers

Hibernate Envers is an auditing and versioning system, providing JBoss EAP with a means to track historical changes to persistent classes. Audit tables are created for entities annotated with @Audited, which store the history of changes made to the entity. The data can then be retrieved and queried.

Envers allows developers to:

  • audit all mappings defined by the JPA specification
  • audit all hibernate mappings that extend the JPA specification
  • audit entities mapped by or using the native Hibernate API
  • log data for each revision using a revision entity
  • query historical data

6.2. About Auditing Persistent Classes

Auditing of persistent classes is done in JBoss EAP through Hibernate Envers and the @Audited annotation. When the annotation is applied to a class, a table is created, which stores the revision history of the entity.

Each time a change is made to the class, an entry is added to the audit table. The entry contains the changes to the class, and is given a revision number. This means that changes can be rolled back, or previous revisions can be viewed.

6.3. Auditing Strategies

6.3.1. About Auditing Strategies

Auditing strategies define how audit information is persisted, queried and stored. There are currently two audit strategies available with Hibernate Envers:

Default Audit Strategy
  • This strategy persists the audit data together with a start revision. For each row that is inserted, updated or deleted in an audited table, one or more rows are inserted in the audit tables, along with the start revision of its validity.
  • Rows in the audit tables are never updated after insertion. Queries of audit information use subqueries to select the applicable rows in the audit tables, which are slow and difficult to index.
Validity Audit Strategy
  • This strategy stores the start revision, as well as the end revision of the audit information. For each row that is inserted, updated or deleted in an audited table, one or more rows are inserted in the audit tables, along with the start revision of its validity.
  • At the same time, the end revision field of the previous audit rows (if available) is set to this revision. Queries on the audit information can then use between start and end revision, instead of subqueries. This means that persisting audit information is a little slower because of the extra updates, but retrieving audit information is a lot faster.
  • This can also be improved by adding extra indexes.

For more information on auditing, see About Auditing Persistent Classes. To set the auditing strategy for the application, see Set the Auditing Strategy.

6.3.2. Set the Auditing Strategy

There are two audit strategies supported by JBoss EAP:

  • The default audit strategy
  • The validity audit strategy
Define an Auditing Strategy

Configure the org.hibernate.envers.audit_strategy property in the persistence.xml file of the application. If the property is not set in the persistence.xml file, then the default audit strategy is used.

Set the Default Audit Strategy

<property name="org.hibernate.envers.audit_strategy" value="org.hibernate.envers.strategy.DefaultAuditStrategy"/>

Set the Validity Audit Strategy

<property name="org.hibernate.envers.audit_strategy" value="org.hibernate.envers.strategy.ValidityAuditStrategy"/>

6.3.3. Adding Auditing Support to a JPA Entity

JBoss EAP uses entity auditing, through About Hibernate Envers, to track the historical changes of a persistent class. This section covers adding auditing support for a JPA entity.

Add Auditing Support to a JPA Entity

  1. Configure the available auditing parameters to suit the deployment. See Configure Envers Parameters for details.
  2. Open the JPA entity to be audited.
  3. Import the org.hibernate.envers.Audited interface.
  4. Apply the @Audited annotation to each field or property to be audited, or apply it once to the whole class.

    Example: Audit Two Fields

    import org.hibernate.envers.Audited;
    
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Column;
    
    @Entity
    public class Person {
        @Id
        @GeneratedValue
        private int id;
    
        @Audited
        private String name;
    
        private String surname;
    
        @ManyToOne
        @Audited
        private Address address;
    
        // add getters, setters, constructors, equals and hashCode here
    }

    Example: Audit an Entire Class

    import org.hibernate.envers.Audited;
    
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Column;
    
    @Entity
    @Audited
    public class Person {
        @Id
        @GeneratedValue
        private int id;
    
        private String name;
    
        private String surname;
    
        @ManyToOne
        private Address address;
    
        // add getters, setters, constructors, equals and hashCode here
    }

Once the JPA entity has been configured for auditing, a table called _AUD will be created to store the historical changes.

6.4. Configuration

6.4.1. Configure Envers Parameters

JBoss EAP uses entity auditing, through Hibernate Envers, to track the historical changes of a persistent class.

Configuring the Available Envers Parameters

  1. Open the persistence.xml file for the application.
  2. Add, remove or configure Envers properties as required. For a list of available properties, see Envers Configuration Properties.

    Example: Envers Parameters

    <persistence-unit name="mypc">
      <description>Persistence Unit.</description>
      <jta-data-source>java:jboss/datasources/ExampleDS</jta-data-source>
      <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
      <properties>
        <property name="hibernate.hbm2ddl.auto" value="create-drop" />
        <property name="hibernate.show_sql" value="true" />
        <property name="hibernate.cache.use_second_level_cache" value="true" />
        <property name="hibernate.cache.use_query_cache" value="true" />
        <property name="hibernate.generate_statistics" value="true" />
        <property name="org.hibernate.envers.versionsTableSuffix" value="_V" />
        <property name="org.hibernate.envers.revisionFieldName" value="ver_rev" />
      </properties>
    </persistence-unit>

6.4.2. Enable or Disable Auditing at Runtime

Enable or Disable Entity Version Auditing at Runtime

  1. Subclass the AuditEventListener class.
  2. Override the following methods that are called on Hibernate events:

    • onPostInsert
    • onPostUpdate
    • onPostDelete
    • onPreUpdateCollection
    • onPreRemoveCollection
    • onPostRecreateCollection
  3. Specify the subclass as the listener for the events.
  4. Determine if the change should be audited.
  5. Pass the call to the superclass if the change should be audited.

6.4.3. Configure Conditional Auditing

Hibernate Envers persists audit data in reaction to various Hibernate events, using a series of event listeners. These listeners are registered automatically if the Envers JAR is in the class path.

Implement Conditional Auditing

  1. Set the hibernate.listeners.envers.autoRegister Hibernate property to false in the persistence.xml file.
  2. Subclass each event listener to be overridden. Place the conditional auditing logic in the subclass, and call the super method if auditing should be performed.
  3. Create a custom implementation of org.hibernate.integrator.spi.Integrator, similar to org.hibernate.envers.event.EnversIntegrator. Use the event listener subclasses created in step two, rather than the default classes.
  4. Add a META-INF/services/org.hibernate.integrator.spi.Integrator file to the JAR. This file should contain the fully qualified name of the class implementing the interface.

6.4.4. Envers Configuration Properties

Table 6.1. Entity Data Versioning Configuration Parameters

Property NameDefault ValueDescription

org.hibernate.envers.audit_table_prefix

 

A string that is prepended to the name of an audited entity, to create the name of the entity that will hold the audit information.

org.hibernate.envers.audit_table_suffix

_AUD

A string that is appended to the name of an audited entity to create the name of the entity that will hold the audit information. For example, if an entity with a table name of Person is audited, Envers will generate a table called Person_AUD to store the historical data.

org.hibernate.envers.revision_field_name

REV

The name of the field in the audit entity that holds the revision number.

org.hibernate.envers.revision_type_field_name

REVTYPE

The name of the field in the audit entity that holds the type of revision. The current types of revisions possible are: add, mod and del for inserting, modifying or deleting respectively.

org.hibernate.envers.revision_on_collection_change

true

This property determines if a revision should be generated if a relation field that is not owned changes. This can either be a collection in a one-to-many relation, or the field using the mappedBy attribute in a one-to-one relation.

org.hibernate.envers.do_not_audit_optimistic_locking_field

true

When true, properties used for optimistic locking (annotated with @Version) will automatically be excluded from auditing.

org.hibernate.envers.store_data_at_delete

false

This property defines whether or not entity data should be stored in the revision when the entity is deleted, instead of only the ID, with all other properties marked as null. This is not usually necessary, as the data is present in the last-but-one revision. Sometimes, however, it is easier and more efficient to access it in the last revision. However, this means the data the entity contained before deletion is stored twice.

org.hibernate.envers.default_schema

null (same as normal tables)

The default schema name used for audit tables. Can be overridden using the @AuditTable(schema="…​") annotation. If not present, the schema will be the same as the schema of the normal tables.

org.hibernate.envers.default_catalog

null (same as normal tables)

The default catalog name that should be used for audit tables. Can be overridden using the @AuditTable(catalog="…​") annotation. If not present, the catalog will be the same as the catalog of the normal tables.

org.hibernate.envers.audit_strategy

org.hibernate.envers.strategy.DefaultAuditStrategy

This property defines the audit strategy that should be used when persisting audit data. By default, only the revision where an entity was modified is stored. Alternatively, org.hibernate.envers.strategy.ValidityAuditStrategy stores both the start revision and the end revision. Together, these define when an audit row was valid.

org.hibernate.envers.audit_strategy_validity_end_rev_field_name

REVEND

The column name that will hold the end revision number in audit entities. This property is only valid if the validity audit strategy is used.

org.hibernate.envers.audit_strategy_validity_store_revend_timestamp

false

This property defines whether the timestamp of the end revision, where the data was last valid, should be stored in addition to the end revision itself. This is useful to be able to purge old audit records out of a relational database by using table partitioning. Partitioning requires a column that exists within the table. This property is only evaluated if the ValidityAuditStrategy is used.

org.hibernate.envers.audit_strategy_validity_revend_timestamp_field_name

REVEND_TSTMP

Column name of the timestamp of the end revision at which point the data was still valid. Only used if the ValidityAuditStrategy is used, and org.hibernate.envers.audit_strategy_validity_store_revend_timestamp evaluates to true.

6.5. Querying Audit Information

6.5.1. Retrieve Auditing Information Through Queries

Hibernate Envers provides the functionality to retrieve audit information through queries.

Note

Queries on the audited data will be, in many cases, much slower than corresponding queries on live data, as they involve correlated subselects.

Querying for Entities of a Class at a Given Revision

The entry point for this type of query is:

AuditQuery query = getAuditReader()
    .createQuery()
    .forEntitiesAtRevision(MyEntity.class, revisionNumber);

Constraints can then be specified, using the AuditEntity factory class. The query below only selects entities where the name property is equal to John:

query.add(AuditEntity.property("name").eq("John"));

The queries below only select entities that are related to a given entity:

query.add(AuditEntity.property("address").eq(relatedEntityInstance));
// or
query.add(AuditEntity.relatedId("address").eq(relatedEntityId));

The results can then be ordered, limited, and have aggregations and projections (except grouping) set. The example below is a full query.

List personsAtAddress = getAuditReader().createQuery()
    .forEntitiesAtRevision(Person.class, 12)
    .addOrder(AuditEntity.property("surname").desc())
    .add(AuditEntity.relatedId("address").eq(addressId))
    .setFirstResult(4)
    .setMaxResults(2)
    .getResultList();

Query Revisions where Entities of a Given Class Changed

The entry point for this type of query is:

AuditQuery query = getAuditReader().createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true);

Constraints can be added to this query in the same way as the previous example. There are additional possibilities for this query:

AuditEntity.revisionNumber()
Specify constraints, projections and order on the revision number in which the audited entity was modified.
AuditEntity.revisionProperty(propertyName)
Specify constraints, projections and order on a property of the revision entity, corresponding to the revision in which the audited entity was modified.
AuditEntity.revisionType()
Provides accesses to the type of the revision (ADD, MOD, DEL).

The query results can then be adjusted as necessary. The query below selects the smallest revision number at which the entity of the MyEntity class, with the entityId ID has changed, after revision number 42:

Number revision = (Number) getAuditReader().createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true)
    .setProjection(AuditEntity.revisionNumber().min())
    .add(AuditEntity.id().eq(entityId))
    .add(AuditEntity.revisionNumber().gt(42))
    .getSingleResult();

Queries for revisions can also minimize/maximize a property. The query below selects the revision at which the value of the actualDate for a given entity was larger than a given value, but as small as possible:

Number revision = (Number) getAuditReader().createQuery()
    .forRevisionsOfEntity(MyEntity.class, false, true)
    // We are only interested in the first revision
    .setProjection(AuditEntity.revisionNumber().min())
    .add(AuditEntity.property("actualDate").minimize()
        .add(AuditEntity.property("actualDate").ge(givenDate))
        .add(AuditEntity.id().eq(givenEntityId)))
    .getSingleResult();

The minimize() and maximize() methods return a criteria, to which constraints can be added, which must be met by the entities with the maximized/minimized properties.

There are two boolean parameters passed when creating the query.

selectEntitiesOnly

This parameter is only valid when an explicit projection is not set.
If true, the result of the query will be a list of entities that changed at revisions satisfying the specified constraints.
If false, the result will be a list of three element arrays. The first element will be the changed entity instance. The second will be an entity containing revision data. If no custom entity is used, this will be an instance of DefaultRevisionEntity. The third element array will be the type of the revision (ADD, MOD, DEL).
selectDeletedEntities
This parameter specifies if revisions in which the entity was deleted must be included in the results. If true, the entities will have the revision type DEL, and all fields, except id, will have the value null.

Query Revisions of an Entity that Modified a Given Property

The query below will return all revisions of MyEntity with a given id, where the actualDate property has been changed.

AuditQuery query = getAuditReader().createQuery()
  .forRevisionsOfEntity(MyEntity.class, false, true)
  .add(AuditEntity.id().eq(id));
  .add(AuditEntity.property("actualDate").hasChanged())

The hasChanged condition can be combined with additional criteria. The query below will return a horizontal slice for MyEntity at the time the revisionNumber was generated. It will be limited to the revisions that modified prop1, but not prop2.

AuditQuery query = getAuditReader().createQuery()
  .forEntitiesAtRevision(MyEntity.class, revisionNumber)
  .add(AuditEntity.property("prop1").hasChanged())
  .add(AuditEntity.property("prop2").hasNotChanged());

The result set will also contain revisions with numbers lower than the revisionNumber. This means that this query cannot be read as "Return all MyEntities changed in revisionNumber with prop1 modified and prop2 untouched."

The query below shows how this result can be returned, using the forEntitiesModifiedAtRevision query:

AuditQuery query = getAuditReader().createQuery()
  .forEntitiesModifiedAtRevision(MyEntity.class, revisionNumber)
  .add(AuditEntity.property("prop1").hasChanged())
  .add(AuditEntity.property("prop2").hasNotChanged());

Query Entities Modified in a Given Revision

The example below shows the basic query for entities modified in a given revision. It allows entity names and corresponding Java classes changed in a specified revision to be retrieved:

Set<Pair<String, Class>> modifiedEntityTypes = getAuditReader()
    .getCrossTypeRevisionChangesReader().findEntityTypes(revisionNumber);

There are a number of other queries that are also accessible from org.hibernate.envers.CrossTypeRevisionChangesReader:

List<Object> findEntities(Number)
Returns snapshots of all audited entities changed (added, updated and removed) in a given revision. Executes n+1 SQL queries, where n is a number of different entity classes modified within the specified revision.
List<Object> findEntities(Number, RevisionType)
Returns snapshots of all audited entities changed (added, updated or removed) in a given revision filtered by modification type. Executes n+1 SQL queries, where n is a number of different entity classes modified within specified revision. Map<RevisionType, List<Object>>
findEntitiesGroupByRevisionType(Number)
Returns a map containing lists of entity snapshots grouped by modification operation, for example, addition, update or removal. Executes 3n+1 SQL queries, where n is a number of different entity classes modified within specified revision.

6.5.2. Traversing Entity Associations Using Properties of Referenced Entities

You can use the properties of a referenced entity to traverse entities in a query. This enables you to query for one-to-one and many-to-one associations.

The examples below demonstrate some of the ways you can traverse entities in a query.

  • In revision number 1, find cars where the owner is age 20 or lives at address number 30, then order the result set by car make.

    List<Car> resultList = auditReader.createQuery()
                    .forEntitiesAtRevision( Car.class, 1 )
                    .traverseRelation( "owner", JoinType.INNER, "p" )
                    .traverseRelation( "address", JoinType.INNER, "a" )
                    .up().up().add( AuditEntity.disjunction().add(AuditEntity.property( "p", "age" )
                           .eq( 20 ) ).add( AuditEntity.property( "a", "number" ).eq( 30 ) ) )
                    .addOrder( AuditEntity.property( "make" ).asc() ).getResultList();
  • In revision number 1, find the car where the owner age is equal to the owner address number.

    Car result = (Car) auditReader.createQuery()
                    .forEntitiesAtRevision( Car.class, 1 )
                    .traverseRelation( "owner", JoinType.INNER, "p" )
                    .traverseRelation( "address", JoinType.INNER, "a" )
                    .up().up().add(AuditEntity.property( "p", "age" )
                            .eqProperty( "a", "number" ) ).getSingleResult();
  • In revision number 1, find all cars where the owner is age 20 or where there is no owner.

    List<Car> resultList = auditReader.createQuery()
                    .forEntitiesAtRevision( Car.class, 1 )
                    .traverseRelation( "owner", JoinType.LEFT, "p" )
                    .up().add( AuditEntity.or( AuditEntity.property( "p", "age").eq( 20 ),
                            AuditEntity.relatedId( "owner" ).eq( null ) ) )
                    .addOrder( AuditEntity.property( "make" ).asc() ).getResultList();
  • In revision number 1, find all cars where the make equals "car3", and where the owner is age 30 or there is no no owner.

    List<Car> resultList = auditReader.createQuery()
                    .forEntitiesAtRevision( Car.class, 1 )
                    .traverseRelation( "owner", JoinType.LEFT, "p" )
                    .up().add( AuditEntity.and( AuditEntity.property( "make" ).eq( "car3" ), AuditEntity.property( "p", "age" ).eq( 30 ) ) )
                    .getResultList();
  • In revision number 1, find all cars where the make equals "car3" or where or the owner is age 10 or where there is no owner.

    List<Car> resultList = auditReader.createQuery()
                    .forEntitiesAtRevision( Car.class, 1 )
                    .traverseRelation( "owner", JoinType.LEFT, "p" )
                    .up().add( AuditEntity.or( AuditEntity.property( "make" ).eq( "car3" ), AuditEntity.property( "p", "age" ).eq( 10 ) ) )
                    .getResultList();

6.6. Performance Tuning

6.6.1. Alternative Batch Loading Algorithms

Hibernate allows you to load data for associations using one of four fetching strategies: join, select, subselect and batch. Out of these four strategies, batch loading allows for the biggest performance gains as it is an optimization strategy for select fetching. In this strategy, Hibernate retrieves a batch of entity instances or collections in a single SELECT statement by specifying a list of primary or foreign keys. Batch fetching is an optimization of the lazy select fetching strategy.

There are two ways to configure batch fetching: per-class level or per-collection level.

  • Per-class Level

    When Hibernate loads data on a per-class level, it requires the batch size of the association to pre-load when queried. For example, consider that at runtime you have 30 instances of a car object loaded in session. Each car object belongs to an owner object. If you were to iterate through all the car objects and request their owners, with lazy loading, Hibernate will issue 30 select statements - one for each owner. This is a performance bottleneck.

    You can instead, tell Hibernate to pre-load the data for the next batch of owners before they have been sought via a query. When an owner object has been queried, Hibernate will query many more of these objects in the same SELECT statement.

    The number of owner objects to query in advance depends upon the batch-size parameter specified at configuration time:

    <class name="owner" batch-size="10"></class>

    This tells Hibernate to query at least 10 more owner objects in expectation of them being needed in the near future. When a user queries the owner of car A, the owner of car B may already have been loaded as part of batch loading. When the user actually needs the owner of car B, instead of going to the database (and issuing a SELECT statement), the value can be retrieved from the current session.

    In addition to the batch-size parameter, Hibernate 4.2.0 has introduced a new configuration item to improve in batch loading performance. The configuration item is called Batch Fetch Style configuration and specified by the hibernate.batch_fetch_style parameter.

    Three different batch fetch styles are supported: LEGACY, PADDED and DYNAMIC. To specify which style to use, use org.hibernate.cfg.AvailableSettings#BATCH_FETCH_STYLE.

    • LEGACY: In the legacy style of loading, a set of pre-built batch sizes based on ArrayHelper.getBatchSizes(int) are utilized. Batches are loaded using the next-smaller pre-built batch size from the number of existing batchable identifiers.

      Continuing with the above example, with a batch-size setting of 30, the pre-built batch sizes would be [30, 15, 10, 9, 8, 7, .., 1]. An attempt to batch load 29 identifiers would result in batches of 15, 10, and 4. There will be 3 corresponding SQL queries, each loading 15, 10 and 4 owners from the database.

    • PADDED - Padded is similar to LEGACY style of batch loading. It still utilizes pre-built batch sizes, but uses the next-bigger batch size and pads the extra identifier placeholders.

      As with the example above, if 30 owner objects are to be initialized, there will only be one query executed against the database.

      However, if 29 owner objects are to be initialized, Hibernate will still execute only one SQL select statement of batch size 30, with the extra space padded with a repeated identifier.

    • Dynamic - While still conforming to batch-size restrictions, this style of batch loading dynamically builds its SQL SELECT statement using the actual number of objects to be loaded.

      For example, for 30 owner objects, and a maximum batch size of 30, a call to retrieve 30 owner objects will result in one SQL SELECT statement. A call to retrieve 35 will result in two SQL statements, of batch sizes 30 and 5 respectively. Hibernate will dynamically alter the second SQL statement to keep at 5, the required number, while still remaining under the restriction of 30 as the batch-size. This is different to the PADDED version, as the second SQL will not get PADDED, and unlike the LEGACY style, there is no fixed size for the second SQL statement - the second SQL is created dynamically.

      For a query of less than 30 identifiers, this style will dynamically only load the number of identifiers requested.

  • Per-collection Level

    Hibernate can also batch load collections honoring the batch fetch size and styles as listed in the per-class section above.

    To reverse the example used in the previous section, consider that you need to load all the car objects owned by each owner object. If 10 owner objects are loaded in the current session iterating through all owners will generate 10 SELECT statements, one for every call to getCars() method. If you enable batch fetching for the cars collection in the mapping of Owner, Hibernate can pre-fetch these collections, as shown below.

    <class name="Owner"><set name="cars" batch-size="5"></set></class>

    Thus, with a batch size of five and using legacy batch style to load 10 collections, Hibernate will execute two SELECT statements, each retrieving five collections.

6.6.2. Second Level Caching of Object References for Non-mutable Data

Hibernate automatically caches data within memory for improved performance. This is accomplished by an in-memory cache which reduces the number of times that database lookups are required, especially for data that rarely changes.

Hibernate maintains two types of caches. The primary cache, also called the first-level cache, is mandatory. This cache is associated with the current session and all requests must pass through it. The secondary cache, also called the second-level cache, is optional, and is only consulted after the primary cache has been consulted.

Data is stored in the second-level cache by first disassembling it into a state array. This array is deep copied, and that deep copy is put into the cache. The reverse is done for reading from the cache. This works well for data that changes (mutable data), but is inefficient for immutable data.

Deep copying data is an expensive operation in terms of memory usage and processing speed. For large data sets, memory and processing speed become a performance-limiting factor. Hibernate allows you to specify that immutable data be referenced rather than copied. Instead of copying entire data sets, Hibernate can now store the reference to the data in the cache.

This can be done by changing the value of the configuration setting hibernate.cache.use_reference_entries to true. By default, hibernate.cache.use_reference_entries is set to false.

When hibernate.cache.use_reference_entries is set to true, an immutable data object that does not have any associations is not copied into the second-level cache, and only a reference to it is stored.

Warning

When hibernate.cache.use_reference_entries is set to true, immutable data objects with associations are still deep copied into the second-level cache.