Chapter 23. Hibernate Search

23.1. Getting Started with Hibernate Search

23.1.1. About Hibernate Search

Hibernate Search provides full-text search capability to Hibernate applications. It is especially suited to search applications for which SQL-based solutions are a poor fit, including full-text, fuzzy, and geolocation searches. Hibernate Search uses Apache Lucene as its full-text search engine, but is designed to minimize the maintenance overhead. Once it is configured, indexing, clustering, and data synchronization are maintained transparently, allowing you to focus on meeting your business requirements.

23.1.2. Overview

Hibernate Search consists of an indexing component and an index search component, both backed by Apache Lucene. Each time an entity is inserted into, updated in, or removed from the database, Hibernate Search keeps track of this event (through the Hibernate event system) and schedules an index update. All these updates are handled without you having to interact with the Apache Lucene APIs directly. Instead, interaction with the underlying Lucene indexes is handled via an IndexManager.

Once the index is created, you can search for entities and return lists of managed entities instead of dealing with the underlying Lucene infrastructure. The same persistence context is shared between Hibernate and Hibernate Search. The FullTextSession class is built on top of the Hibernate Session class, so application code can use the unified org.hibernate.Query or javax.persistence.Query APIs in exactly the same way as it would for an HQL, JPA-QL, or native query.

Note

It is recommended, for both your database and Hibernate Search, to execute your operations in a transaction, whether JDBC or JTA.

Note

Hibernate Search works with the Hibernate / EntityManager long conversation pattern, also known as atomic conversation.

23.1.3. About the Index Manager

Each time an entity is inserted into, updated in, or removed from the database, Hibernate Search keeps track of this event through the Hibernate event system and schedules an index update. Interaction with the underlying Lucene indexes is handled by an IndexManager, each of which is uniquely identified by name. By default, there is a one-to-one relationship between an IndexManager and a Lucene index. The IndexManager abstracts the specific index configuration, including the selected back end, the reader strategy, and the chosen DirectoryProvider.

23.1.4. About the Directory Provider

Apache Lucene, which is part of the Hibernate Search infrastructure, has the concept of a Directory for storage of indexes. Hibernate Search handles the initialization and configuration of a Lucene Directory instance via a Directory Provider.

The directory_provider property specifies the directory provider used to store the indexes. The default directory provider is filesystem, which stores indexes on the local filesystem.
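As an illustration, the following sketch collects the directory settings for the default index configuration in a java.util.Properties object. It assumes the fully prefixed property names hibernate.search.default.directory_provider and hibernate.search.default.indexBase; the class name, the helper method, and the index path are hypothetical and chosen only for this example.

```java
import java.util.Properties;

// Sketch: collects Hibernate Search directory settings as plain key/value pairs.
// The class and method names are hypothetical; only the property keys matter.
class DirectoryProviderConfigSketch {

    // Builds the settings for the default index configuration.
    // indexBasePath is a hypothetical location supplied by the caller.
    static Properties filesystemSettings(String indexBasePath) {
        Properties props = new Properties();
        // "filesystem" stores the indexes on the local filesystem.
        props.setProperty("hibernate.search.default.directory_provider", "filesystem");
        // Root directory under which the index directories are created.
        props.setProperty("hibernate.search.default.indexBase", indexBasePath);
        return props;
    }

    public static void main(String[] args) {
        Properties props = filesystemSettings("/var/lucene/indexes");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The same key/value pairs can equally be placed in the persistence or Hibernate configuration file used by the application.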

23.1.5. About the Worker

Updates to Lucene indexes are handled by the Hibernate Search Worker, which receives all entity changes, queues them by context, and applies them once the context ends. The most common context is the transaction, but the context may also depend on the number of entity changes or on other application (life cycle) events.

For better efficiency, interactions are batched and generally applied once the context ends. Outside a transaction, the index update operation is executed right after the actual database operation. In the case of an ongoing transaction, the index update operation is scheduled for the transaction commit phase and discarded in case of transaction rollback. A worker may be configured with a specific batch size limit, after which indexing occurs regardless of the context.

For details of Worker configuration options see Section 23.2.5, “Worker Configuration”.

There are two immediate benefits to this method of handling index updates:

Performance: Lucene indexing works better when operations are executed in batches.
ACIDity: The work executed has the same scoping as the work executed by the database transaction, and it is executed if and only if the transaction is committed. This is not ACID in the strict sense, but ACID behavior is rarely useful for full-text search indexes because they can be rebuilt from the source at any time.

The two batch modes, no scope and transactional, are the equivalent of autocommit and transactional behavior. From a performance perspective, the transactional mode is recommended. The scoping choice is made transparently: Hibernate Search detects the presence of a transaction and adjusts the scoping (see Section 23.2.5, “Worker Configuration”).
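The queueing behavior described in this section can be modeled with a small, self-contained sketch. This is a conceptual model only, not the Hibernate Search Worker implementation: the class TransactionScopedWorkerSketch and all of its methods are hypothetical, and a plain list stands in for the Lucene index.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: not the Hibernate Search Worker implementation.
class TransactionScopedWorkerSketch {
    private final List<String> queued = new ArrayList<>();   // work waiting for the context to end
    private final List<String> applied = new ArrayList<>();  // stands in for the Lucene index
    private final int batchSizeLimit;                         // 0 means no limit
    private boolean inTransaction;

    TransactionScopedWorkerSketch(int batchSizeLimit) {
        this.batchSizeLimit = batchSizeLimit;
    }

    void begin() {
        inTransaction = true;
    }

    // Records an entity change reported by the event system.
    void onEntityChange(String change) {
        if (!inTransaction) {
            // No scope: apply right after the database operation.
            applied.add(change);
            return;
        }
        queued.add(change);
        if (batchSizeLimit > 0 && queued.size() >= batchSizeLimit) {
            // Batch size limit reached: index regardless of the context.
            // Work flushed here is not undone by a later rollback.
            flush();
        }
    }

    // Commit: apply everything that is still queued.
    void commit() {
        flush();
        inTransaction = false;
    }

    // Rollback: discard everything that is still queued.
    void rollback() {
        queued.clear();
        inTransaction = false;
    }

    private void flush() {
        applied.addAll(queued);
        queued.clear();
    }

    List<String> applied() {
        return new ArrayList<>(applied);
    }

    public static void main(String[] args) {
        TransactionScopedWorkerSketch worker = new TransactionScopedWorkerSketch(0);
        worker.begin();
        worker.onEntityChange("index Book#1");
        worker.rollback();                      // queued work is discarded
        worker.begin();
        worker.onEntityChange("index Book#2");
        worker.commit();                        // queued work is applied
        System.out.println(worker.applied());
    }
}
```

In this model, a change made outside a transaction is applied immediately, which corresponds to the no scope (autocommit-like) mode, while changes made between begin() and commit() are applied together.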

23.1.6. Back End Setup and Operations

23.1.6.1. Back End

Hibernate Search uses various back ends to process batches of work. Configuring a back end is not limited to the option default.worker.backend. This property specifies an implementation of the BackendQueueProcessor interface, which is only one part of a back end configuration. Some back ends, for example the JMS back end, require additional settings.

23.1.6.2. Lucene

In the Lucene mode, all index updates for a node (JVM) are executed by the same node to the Lucene directories using the directory providers. Use this mode in a non-clustered environment or in clustered environments with a shared directory store.

Figure 23.1. Lucene Back End Configuration

Lucene mode targets non-clustered or clustered applications where the Directory manages the locking strategy. The primary advantage of Lucene mode is simplicity and immediate visibility of changes in Lucene queries. The Near Real Time (NRT) back end is an alternate back end for non-clustered and non-shared index configurations.

23.1.6.3. JMS

Index updates for a node are sent to the JMS queue. A unique reader processes the queue and updates the master index. The master index is then replicated regularly to the slave copies, which establishes the master/slave pattern. The master is solely responsible for updating the Lucene index. The slaves accept both read and write operations, but they process only read operations on their local index copies and delegate update operations to the master.

Figure 23.2. JMS Backend Configuration

This mode targets clustered environments where throughput is critical and index update delays are affordable. Reliability is ensured by the JMS provider and by having the slaves work on local copies of the index.
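The master/slave flow can be sketched with in-memory stand-ins. This is a conceptual model only and uses no JMS or Hibernate Search APIs: the class MasterSlaveSketch, its nested Slave class, and all method names are hypothetical, an ArrayDeque stands in for the JMS queue, and plain lists stand in for the Lucene indexes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Conceptual model only: an in-memory queue stands in for the JMS queue.
class MasterSlaveSketch {
    private final Queue<String> queue = new ArrayDeque<>();      // stands in for the JMS queue
    private final List<String> masterIndex = new ArrayList<>();  // stands in for the master index

    // A slave node: reads from its local copy, delegates updates to the master.
    static class Slave {
        private final MasterSlaveSketch master;
        private List<String> localCopy = new ArrayList<>();

        Slave(MasterSlaveSketch master) {
            this.master = master;
        }

        // Update operations are not applied locally; they are sent to the queue.
        void update(String document) {
            master.queue.add(document);
        }

        // Read operations are served from the local index copy.
        List<String> search() {
            return new ArrayList<>(localCopy);
        }

        // Periodic replication: replace the local copy with the master's state.
        void refreshFromMaster() {
            localCopy = new ArrayList<>(master.masterIndex);
        }
    }

    // The unique queue reader: only the master applies updates to the index.
    void processQueue() {
        String document;
        while ((document = queue.poll()) != null) {
            masterIndex.add(document);
        }
    }

    public static void main(String[] args) {
        MasterSlaveSketch master = new MasterSlaveSketch();
        Slave slave = new Slave(master);
        slave.update("Book#1");
        System.out.println("before replication: " + slave.search());
        master.processQueue();
        slave.refreshFromMaster();
        System.out.println("after replication: " + slave.search());
    }
}
```

The delay between update() and the refreshed search result in this model corresponds to the index update delay that this mode accepts in exchange for throughput.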

23.1.7. Reader Strategies

When executing a query, Hibernate Search uses a reader strategy to interact with the Apache Lucene indexes. Choose a reader strategy based on the profile of the application (frequent updates, read mostly, asynchronous index update, etc).

23.1.7.1. The Shared Strategy

Using the shared strategy, Hibernate Search shares the same IndexReader for a given Lucene index across multiple queries and threads, provided that the IndexReader is still up to date. If the IndexReader is not up to date, a new one is opened and provided. Each IndexReader is made of several SegmentReaders. The shared strategy reopens only the segments that have been modified or created since the last opening and shares the already loaded segments from the previous instance. This is the default strategy.
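The reuse logic can be illustrated with a simplified model. The classes below are hypothetical stand-ins and are not Lucene or Hibernate Search APIs: Index models an index with a version counter, Reader models an IndexReader, and SegmentReader models a loaded segment. The sketch only covers newly created segments and assumes the index is not modified while openReader() runs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual model only: hypothetical classes, not Lucene or Hibernate Search APIs.
class SharedReaderSketch {

    // Stands in for an index: segment name -> contents, plus a version counter.
    static class Index {
        final Map<String, String> segments = new LinkedHashMap<>();
        long version;

        void addSegment(String name, String contents) {
            segments.put(name, contents);
            version++;
        }
    }

    // Stands in for an already loaded segment.
    static class SegmentReader {
        final String name;

        SegmentReader(String name) {
            this.name = name;
        }
    }

    // Stands in for an IndexReader made of several segment readers.
    static class Reader {
        final long version;
        final Map<String, SegmentReader> segmentReaders;

        Reader(long version, Map<String, SegmentReader> segmentReaders) {
            this.version = version;
            this.segmentReaders = segmentReaders;
        }
    }

    private final Index index;
    private Reader current;

    SharedReaderSketch(Index index) {
        this.index = index;
    }

    // Returns the shared reader if it is up to date; otherwise opens a new one
    // that reuses the segment readers already loaded by the previous instance.
    synchronized Reader openReader() {
        if (current != null && current.version == index.version) {
            return current;
        }
        Map<String, SegmentReader> readers = new LinkedHashMap<>();
        for (String name : index.segments.keySet()) {
            SegmentReader previous = current == null ? null : current.segmentReaders.get(name);
            readers.put(name, previous != null ? previous : new SegmentReader(name));
        }
        current = new Reader(index.version, readers);
        return current;
    }

    public static void main(String[] args) {
        Index index = new Index();
        index.addSegment("segment_1", "first batch");
        SharedReaderSketch provider = new SharedReaderSketch(index);
        Reader first = provider.openReader();
        System.out.println("same reader reused: " + (first == provider.openReader()));
        index.addSegment("segment_2", "second batch");
        Reader second = provider.openReader();
        System.out.println("new reader opened: " + (first != second));
    }
}
```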

23.1.7.2. The Not-shared Strategy

Using the not-shared strategy, a Lucene IndexReader is opened every time a query executes. Opening and starting up an IndexReader is an expensive operation, so opening one for each query execution is not an efficient strategy.

23.1.7.3. Custom Reader Strategies

You can write a custom reader strategy using an implementation of org.hibernate.search.reader.ReaderProvider. The implementation must be thread safe.

23.1.7.4. Reader Strategy Configuration

Change the strategy from the default (shared) to not-shared as follows:

hibernate.search.[default|<indexname>].reader.strategy = not-shared

Alternatively, customize the reader strategy by replacing my.corp.myapp.CustomReaderProvider with the custom strategy implementation:

hibernate.search.[default|<indexname>].reader.strategy = my.corp.myapp.CustomReaderProvider
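The [default|<indexname>] part of the property name means that a value can be set for all indexes or for one named index. The sketch below shows one way to read such a setting, assuming that an index-specific value takes precedence over the default value and that shared applies when neither is set. The class ReaderStrategyConfigSketch, its method, and the index name Books are hypothetical.

```java
import java.util.Properties;

// Sketch: resolves the reader strategy for an index from plain properties.
// The class and method are hypothetical helpers, not Hibernate Search APIs.
class ReaderStrategyConfigSketch {

    static String readerStrategy(Properties props, String indexName) {
        // An index-specific value is checked first.
        String specific = props.getProperty("hibernate.search." + indexName + ".reader.strategy");
        if (specific != null) {
            return specific;
        }
        // Otherwise fall back to the default scope, then to the shared strategy.
        return props.getProperty("hibernate.search.default.reader.strategy", "shared");
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("hibernate.search.default.reader.strategy", "not-shared");
        props.setProperty("hibernate.search.Books.reader.strategy",
                "my.corp.myapp.CustomReaderProvider");
        System.out.println("Books: " + readerStrategy(props, "Books"));
        System.out.println("Authors: " + readerStrategy(props, "Authors"));
    }
}
```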
